<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:clearspace="http://www.jivesoftware.com/xmlns/clearspace/rss" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Clearspace Server Syndication Feed</title>
    <link>http://www.enzeecommunity.com/blogs</link>
    <description>A syndication feed of all the blogs on this system</description>
    <pubDate>Sat, 28 Aug 2010 04:28:35 GMT</pubDate>
    <generator>Clearspace 2.5.3 (http://jivesoftware.com/products/clearspace/)</generator>
    <dc:date>2010-08-28T04:28:35Z</dc:date>
    <item>
      <title>Team Makeup</title>
      <link>http://www.enzeecommunity.com/blogs/grill/2010/08/28/team-makeup</link>
      <description>&lt;!-- [DocumentBodyStart:a694b939-9ab4-453a-8d4a-ff94f2ce267e] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p&gt;As some of you know, here in Texas the sport of football is a bit more than a team sport. In some locales, it's practically royalty. Back in high school and college, I was ever-aware of the sports fans that liked to wear the team colors. One in particular invited me to lunch while his parents were in town (I already knew them from back home) but I never knew that his father owned a car that was actually painted in Dallas Cowboys colors. Another colleague of mine was stuck in a parking garage with a dead battery as I was leaving work, and asked if I could help him with a jump. Upon saying yes, he produced a set of maroon-and-white jumper cables, honoring Texas A&amp;amp;M University. Oh yeah, he was a fan.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Then several of my friends kept large jars of team-color makeup and would smear it all over them before a game. No, not to go see the game at the stadium, but in their &lt;em&gt;apartment&lt;/em&gt;! Yes, the team makeup was always a big hit, and their parties were solely centered on the tiny color TV in the center of their rather modest apartment. I had once wondered what the people next door thought of their shouts, whoops and antics, until I learned that the folks next door were just as rowdy with their own team. Hey, ya gatta be a fan.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Okay that's not the team makeup I'm talking about here, but those were just such funny memories I thought I'd share. And to make a point of course. The enthusiasm of your developers and architects, and their desire to cheer you on with your goals to achieve, are directly dependent on the team makeup. Of people, that is. So what does it take to make a data warehouse? 
Or for that matter, what does it take to effectively roll out a new one, or migrate an old one?&lt;/p&gt;&lt;p&gt;&lt;br/&gt;What most people fail to estimate in taming such a beast, is the level of testing required to make it a reality. As an example, many of you must produce documents as part of your regular workday. Those documents are often hard to write, but even more work to proofread. In fact, proofreading is the same form of content-and-context testing we would do for a data warehouse. The chief reason is the product - information and knowledge. Business intelligence is the same way - it has a way of taking on a life of its own, but the only way we can reliably roll out a viable business intelligence platform, is to test, test and test some more. Eyeballs-to-page however, may be required for a book or document, but it won't scale for a warehouse.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Many people don't realize that the testing portion of the warehouse can take as much as 80 percent of the project's resources. While we can compress this somewhat with agile methods, we cannot afford to test such quantities of data with simplistic manual approaches. And by that I mean eye-ball examination or screen-shot testing. No, the majority of testing is in the data itself, and on a Netezza machine it's in the billions of records. Eyeballs don't have the bandwidth. We need to use the actual power of the machine to scale this mountain.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;So what should the team itself look like? May I suggest that for every Architect you would have several developers, and for every developer you would have two or more testers. Ideally three testers for each developer, regardless of how many developers are actually doing the work. I will also suggest that you keep the total count of developers rather lean. 
Five is perhaps overkill for the back end. For the front end, three solid developers can be an army, and five is about the upper limit. The reason is simple: logistics. If you have five developers on the back end and five on the front end, and three testers in the wake of each, this is a team of 40 people - which quite frankly is overkill in any sense of the word. Not to say that an overall team might not be comprised of 40 people once we include all of the infrastructure folks, but not for pure develop-and-test. We can and should make it leaner.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;As an aside, I had the rather disturbing experience, numerous times, of encountering folks who worked these things out with overblown spreadsheets that they normally used for application development estimations. A data warehouse gig is completely different from an application development gig. But of course, if one of these spreadsheet guys ever showed up and plugged in his metrics, he would spout off that we need a 30-person team to migrate a couple of tables from one machine to the new one. Once this number is in the air, it becomes the de-facto standard by which all discussions are measured, even though it is completely wrong. In another setting, another spreadsheet-guy plugged in his numbers and characterized a project as a $900k gig when our competitors were bidding $300k for the same work. Knock yourself out, dude, because the client ain't a-bitin' three times more expensive projects because they like our faces. True to form, the $300k bid actually won. But the irony was, that the potential client had no desire whatsoever to pay more than about $400k, so the bid fit their budget just fine. The eventual winner of the bid took a bath, however. 
The truth is always somewhere in the middle.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;I still say, watch those spreadsheet-guys. Somethin' up with that.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Perhaps it goes without saying that an architect needs to lay out a framework so that all can work comfortably in the same sandbox. This is a challenge and should not be left to the developers to forge on their own. Harnessing it later will be impossible, because too many opportunities for flow-based consolidation will be lost. Workarounds and repetitive logic will become the rule. Let's not go there.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;If we have, say, three solid developers in the back and front end each, they can and should cross-test each other's work. In this case, we have the senior developers and architect working on the core logic, and the junior developers bench-checking their work and zipping it up for a formal testing team. Here we have a synergy: a senior developer can crank out ten times the work and quality of the juniors (so say Demarco and Lister; your actual mileage may vary). Nonetheless, we would not want to put a junior developer in the front seat of this chain because the testers will be waiting on him more often than not. But with the ten-times-more-power driving the front like a locomotive, the junior developers can wrap up the many tactical areas of the warehouse and cross-test each other, but also receive the work products from the senior developer.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Now think about what this kind of model means. The senior developer is literally force-feeding the pipeline with work products and is doing it with the highest quality the team has to offer. 
The junior developers are learning from the senior without injecting their own inexperience into the mix, which will invariably have to be reviewed by the senior developer anyhow. No, the senior developer is more productive and experienced, so let him/her drive. Seems like every senior developer I talk to, they really, really want to develop and have the testing tedium off their plates. The junior developers really, really want to learn from a senior developer, and of course want to do some development themselves. I'm not saying this is off-limits, but the senior developer can delegate-what-he-knows to the junior developers because they cannot go too far astray without his guidance anyhow, so it's a win-win. And of course, I and every other person who was ever a junior developer had to pay our dues, so not everyone can be the leader. I don't say this dismissively, but we know in a business intelligence project there has to be a driving mind. Too much consensus means too little leadership, and in the famous words of Margaret Thatcher "Consensus is the absence of leadership".&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;But for the people doing the testing, they need something that will scale. To billions of records. And it had better be solid on the first round or they will be playing catch-up for every round after that. While writing and proofreading a document is an eyeballs-only model, don't you think I could at least do myself a favor and run a spell-check and grammar-check on the contents? Such set-based operations resolve a world of problems and let me focus my eyeballs on the harder stuff. But in a data warehouse, our eyeballs will never have enough bandwidth, and will never scale to the necessary heights. Set-based testing is all we have, but it's also all we &lt;em&gt;need&lt;/em&gt;. 
And with a Netezza machine, we're &lt;em&gt;so&lt;/em&gt; in the zone.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Now testing of the report screens can involve eyeball-based activities but doesn't have to be so egregious. Automated testing tools go a long way to mitigate the necessity for eyeballs on these as well (for the subjective parts like positioning, banners or colors especially). However, if the data is wrong, no amount of pretty-pretty will fix it. As Murphy would say "Beauty is only skin deep, but ugly goes to the bone."&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Now, no sooner will I write this than I will get feedback from those junior developers who say that they have been relegated, but not to fear. This particular article is in context of a high-productivity bubble of work, normally found with new projects or migrations. The priority is not to make people feel better about their role, but to get past the workload so that everyone can feel better about the work products. I am always looking for opportunities to stretch the developers, both junior and senior. When a junior is ready to sit in the driver's seat of the locomotive, it's because he's passed the Demarco and Lister smoke test. Now what the heck is &lt;em&gt;that&lt;/em&gt;, anyhow?&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Get a copy of Demarco and Lister's &lt;em&gt;Peopleware&lt;/em&gt;, a classic in every decade. Something they have empirically measured, is that a junior developer will start out at one level of productivity, and then in a sudden epiphany will transform into someone who is ten times more productive than before. Something mentally and/or emotionally clicks and they get this &lt;em&gt;whoosh&lt;/em&gt;. 
They claim it is different timing for each developer, but usually takes about two years to make this transition. This is perhaps one reason why so many job-search requirement listings show "X years of experience in Y" and the "X" is never less than two years. Not because the poster has ever read Peopleware, but we who are in the field want folks who are 2 years along because we already know they have (at least) transitioned into a high-productivity asset.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;But this is the mechanism &lt;em&gt;driving&lt;/em&gt; the team makeup - and the experience of the developers and their known levels of productivity should help us find the right role for them on the team. We don't want a low productivity person in the locomotive chair. But having one in the wake of a strong developer only makes them stronger and exposes them to practices that will accelerate their transition into the higher productivity person we always wanted anyhow. And then, of course, once the person has made the 10x transition and is self-aware of their value, we have another problem: They are self-aware of their salary level too! Making someone stronger makes them more valuable. Be prepared to recognize the value (or rest assured that your competitor will). But all this, is the nature of the beast we purport to tame, no?&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Back to set-based testing. This has as much to do with using the right data as it does the right method. The right data means - select a subset of known data that will deliberately exercise all of your business rules and software paths. Nothing is worse than realizing such errors in production. Then, we need set-based testing methods. 
This means we need three primary assets: (a) source data that we can sluice through our application transforms to get a result (b) a saved baseline result to compare this recent result against and (c) tested components that compare these two results in a reliable manner so that we get a statistical report on what passed and what didn't, and a detailed report on what specific records didn't make the cut. Counts, amounts, checksums and summaries all reveal deviations, especially for regression testing. You might recognize this as an exception report, and this is exactly the spirit of the effort. Our testing has to deal with statistical exceptions, because it is the only practical and scalable way to validate billions of rows.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;And also notice that such a practice would be the kiss-of-death in many other "secondhand" data warehouse platforms. Such platforms are in no wise optimized to compare monstrous sets of data to each other column-for-column, row-for-row. Queries like that can dim-the-lights and may not return for hours, if not days. We cannot afford a protracted testing phase, and with Netezza we don't have to. Scan times and comparison times are very objective and knowable. The tests will take the same amount of time each time they run, and we always have the option to optimize them further with the Netezza performance model. Power is in the physics.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;And again, why all the focus on testing? I have seen data warehouses blind-side an organization that accounted only for the opposite equation - 80 percent development and 20 percent testing, when more often than not, exactly the opposite is true. This would mean that if a two-month development effort were characterized with one model (the wrong one) it would look like at most a three-month effort. 
Why then does it metastasize into a ten-month effort? Because if those 2 months are really only 20 percent of the effort, the other 80 percent (8 months) is testing.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;That is, if we just embrace the standard model. By embracing the aforementioned model, we get the development out of the way quickly and deliberately, entering the testing phase much sooner, and if more heads are deliberately dedicated to set-based testing we can close this part off even sooner. I have watched very-large-scale projects, with a Netezza team in the middle and strong developers in the locomotive seat, enter their first UAT phase within two months of the project's inception. The funny thing is, the model requires rapid turnaround that only the Netezza workhorse can provide. Try pulling off this team makeup with any other lower-productivity technology, and it won't sing in the same key. A high-productivity developer is meaningless on low-productivity technology. And high-productivity testing methods are useless if enslaved to a low-productivity technology.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Start it, shape it, ship it. Netezza is the ticket home.&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:a694b939-9ab4-453a-8d4a-ff94f2ce267e] --&gt;</description>
      <pubDate>Sat, 28 Aug 2010 04:28:35 GMT</pubDate>
      <author>dbirmingham</author>
      <guid>http://www.enzeecommunity.com/blogs/grill/2010/08/28/team-makeup</guid>
      <dc:date>2010-08-28T04:28:35Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/blogs/grill/comment/team-makeup</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/grill/feeds/comments?blogPost=1169</wfw:commentRss>
    </item>
    <item>
      <title>Talkin' 'bout my generation</title>
      <link>http://www.enzeecommunity.com/blogs/nzblog/2010/08/26/talkin-bout-my-generation</link>
      <description>&lt;!-- [DocumentBodyStart:9816c943-cdec-495d-942a-a5aa01b259cf] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;In a recent blog, Greg Rahn of Oracle responded to Phil’s “&lt;a class="jive-link-external-small" href="http://www.netezza.com/exadata-twinfin-compared/index.aspx"&gt;Oracle Exadata and Netezza TwinFin Compared&lt;/a&gt;” eBook; before commenting on an Oracle engineer’s views, I’ll restate the eBook’s larger themes.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span&gt;Exadata connects Oracle’s RAC database, its architecture designed for online transaction processing (OLTP), via a fast network to a massively parallel processing storage tier. As an OLTP database paired with a specialized storage subsystem, tuning Exadata to function as a data warehouse is complicated and demands skilled, highly trained, experienced technical staff. 
Mitigating the shortcoming of an OLTP database pressed into service as an analytic database with expensive network and storage makes Exadata costly: to acquire; to design, tune and maintain as an optimally-configured data warehouse; to run in the data center.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span&gt;Netezza TwinFin, designed as an analytic database, brings the power of massively parallel processing to manage and exploit data at terabyte-to-petabyte scale. TwinFin is an appliance–easy to install, easy to operate and easy to manage. TwinFin offers value: fast performance for advanced analytics at an affordable price.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Now I’ll discuss the detail of Greg’s &lt;a class="jive-link-external-small" href="http://structureddata.org/2010/08/10/oracle-exadata-and-netezza-twinfin-compared-%E2%80%93-an-engineer%E2%80%99s-analysis/"&gt;blog&lt;/a&gt; and respond from a Netezza perspective.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Claim: Exadata Smart Scan does not work with index-organized tables or clustered tables.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Greg responds that “&lt;em&gt;IOTs and clustered tables are both structures optimized for fast primary key access, like the type of access in OLTP workloads, not data warehousing&lt;/em&gt;” and suggests our intent was to mislead by quoting from an old Oracle datasheet. It wasn’t. 
Oracle 11g Release 2 documentation reads “&lt;em&gt;Index-organized tables are suitable for modeling application-specific index structures. For example, content-based information retrieval applications containing text, image and audio data require inverted indexes that can be effectively modeled using index-organized tables.&lt;/em&gt;” Elsewhere the documentation states “&lt;em&gt;Index-organized tables are useful when related pieces of data must be stored together or data must be physical stored in a specific order. This type of table is often used for information retrieval, spatial and OLAP applications&lt;/em&gt;.” In the eBook Phil discusses first and second generation data warehouses; many of the applications described by Oracle as candidates for IOTs are typical of those our customers run on TwinFin – these are second generation data warehouse applications. Greg believes Exadata smart scan not working with index-organized tables has zero impact on Exadata customers. Is it reasonable to conclude that Exadata is not being used for second generation data warehousing?&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Claim: Exadata Smart Scan does not work with the TIMESTAMP datatype.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Since we published the first edition of the eBook Christian Antognini, the original source of this information, goes to the heart of the matter in his &lt;a class="jive-link-external-small" href="http://antognini.ch/2010/08/exadata-storage-server-and-the-query-optimizer-%E2%80%93-part-4/"&gt;blog&lt;/a&gt;: “&lt;em&gt;The essential thing to understand is that this limitation is due to bug 9682721. The fix is expected to be part of 11.2.0.2. 
According to my test cases (that&lt;/em&gt; &lt;em&gt;Greg Rahn was so kind to execute against an early release of 11.2.0.2), offloading works correctly for all datetime functions but for the following three predicates.&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;ul type="disc"&gt;&lt;li class="MsoNormal" style="line-height: normal;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;em&gt;months_between(d,sysdate) = 0&lt;/em&gt;&lt;/span&gt;&lt;/li&gt;&lt;li class="MsoNormal" style="line-height: normal;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;em&gt;months_between(d,current_date) = 0&lt;/em&gt;&lt;/span&gt;&lt;/li&gt;&lt;li class="MsoNormal" style="line-height: normal;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;em&gt;months_between(d,to_date(‘01-01-2010’,’DD-MM-YYYY’)) = 0”&lt;/em&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;span style="font-size: 10pt;"&gt;&lt;em&gt;&lt;br/&gt;&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="line-height: normal;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;em&gt;Note that the MONTHS_BETWEEN function can basically be offloaded. The problem in these cases is that the offloading does not work when, for example, SYSDATE is used as a parameter.&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;While happy to let this one pass, I have a question. 
Do organizations accrue value or cost from a technology requiring its administrators understand all combinations of functions, their predicates and their parameters before they are capable of designing queries to be processed in parallel?&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Claim: When transactions (insert, update, delete) are operating against the data warehouse concurrent with query activity, smart scans are disabled. Dirty buffers turn off smart scan.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;In my opening comments I compared TwinFin’s simplicity to the complexity of Exadata. All queries submitted to TwinFin are processed in its massively parallel grid; no tuning, no special database design. This is appliance simplicity. In Exadata whether a query benefits from smart scans (massively parallel processing) can depend on the state of the data being read. Exadata requires developers to understand at great depth the physical path a query takes to access data. This is complexity.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;While Greg concedes Exadata’s MPP processing is disabled for &lt;span&gt;those blocks containing an active transaction he&lt;/span&gt; is confident that “&lt;em&gt;Not having Smart Scan for small number of blocks will have a negligible impact on performance&lt;/em&gt;”. My experience with Netezza’s customers and their applications prompts me to take a more circumspect view. 
I’ll explain why in the next section.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Claim: Using [a shared-disk] architecture for a data warehouse platform raises concern that contention for the shared resource imposes limits on the amount of data the database can process and the number of queries it can run concurrently.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Greg argues contention for shared disk is not a problem for Exadata and cites Daniel Abadi’s &lt;a class="jive-link-external-small" href="http://dbmsmusings.blogspot.com/2010/08/defending-oracle-exadata.html"&gt;blog&lt;/a&gt; in his defense. Let’s take a look at what Daniel says on this subject “&lt;em&gt;If you are going to make an argument that shared-disk causes scalability problems, you have to make the argument that contention for the one shared resource in a shared-disk system is high enough to cause a performance bottleneck in the system - namely, you have to argue that the network connection between the servers and the shared-disk is a bottleneck.&lt;/em&gt;” This is the argument Phil makes in our eBook. Consider a query analyzing correlations between equity trades in a sector of a stock market. The algorithm calculates Spearman’s rank correlation coefficient (Spearman’s rho), measuring statistical dependence between two variables by assessing how well the relationship between them can be described. 
This analysis creates valuable insight into whether specific equities influence behavior of other equities in the same market sector within a window of one to ten minutes.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;The customer loads a massive volume of trading data into TwinFin and constantly trickle feeds data from live markets into the warehouse. The query is run and re-run constantly to assess behavior of different equities in dynamic markets. Each time, TwinFin completes a Cartesian join between all the equities in the sector while at the same time calculating a Volume-Weighted Average Price and a Return From Previous Close value for the equity under investigation. The results pass to Spearman’s rank correlation coefficient function to calculate the Population Covariance and the standard deviation of every equity combination for the time period. Netezza executes every step of the query in parallel utilizing all of TwinFin’s hardware and software resources. Netezza’s intelligent storage selects only the rows needed for that market sector and projects only the columns needed for assessment. The join result is directly streamed to the code implementing the statistical analysis which TwinFin downloads to every processor in its MPP grid, running the complex calculations in parallel. Results from each node in the MPP grid are returned via the network to the host for final assembly and rendering back to the requesting application. TwinFin completes the analysis in a few minutes, and then runs it again, and again for as long as the market is open.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;After several hours Oracle 10G was still attempting to complete its first round of analysis. 
What difference will a new version of the Oracle database paired with an MPP storage system and a fast network make? Exadata’s MPP storage grid is unable to process Cartesian joins, the first step in this analytic process, meaning it brings no performance gain but must put all records on the network and send them across to Oracle RAC. Even if it were able to process the join, Exadata cannot push down user defined functions, used to implement the calculations, to MPP - in Oracle, functions always execute on the RAC servers. In processing the algorithms Oracle must create and manage temporary data sets and write these out of memory for storage. Exadata’s flash cache may play some role here, but the size of the data sets and the complexity of the algorithms will force database processes to write to disk. This flow from Oracle RAC is back across a network still clogged with data coming from the MPP storage tier, queued and unprocessed, waiting for attention from a fully-consumed Oracle RAC. I contend that Exadata’s network connection between the servers and the shared-disk is a bottleneck. Not Exadata’s only bottleneck. TwinFin demonstrates how a true MPP architecture excels in calculating Spearman’s rank correlation coefficient - a real workload on a real dataset. Oracle’s OLTP database, simply not designed to process large-scale analytics, is overwhelmed. Exadata suffers contention on its network and in its database system’s shared disk architecture.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Back to the previous point about Exadata’s MPP processing being disabled for &lt;span&gt;blocks containing an active transaction – the customer is constantly loading new market data and analyzing it in comparison with a massive volume of historic data. 
While entirely appropriate for transaction processing, Exadata’s architecture of disabling an entire block from parallel processing when a single record in the block is being updated can only hinder and never help in the data warehouse. The very point of a data warehouse is that all data should be available to the business as quickly as extract-transform-load processing allows. By pressing an OLTP database into service as an analytical database, Oracle unnecessarily burdens customers with creating database designs to work around this complexity and with developing a thorough understanding of how each query accesses the data model. While not having Smart Scan for a small number of blocks may or may not impact performance, as an unnecessary complexity demanding the attention of database specialists, it costs customers real money.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Claim: Analytical queries, such as “find all shopping baskets sold last month in Washington State, Oregon and California containing product X with product Y and with a total value more than $35” must retrieve much larger data sets, all of which must be moved from storage to database.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Greg shows some nice SQL to demonstrate how Exadata processes the beer and pizza query. 
Give the business an answer and they always come back with a new question: “Greg, &lt;em&gt;what was the total value of ‘Brand #42 beer’ sold in each basket&lt;/em&gt;?” Greg can now update his SQL with the clause:&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="margin-left: 0.5in;"&gt;&lt;span style="font-size: 10pt;"&gt;sum(case when p.product_description in ('Brand #42 beer') then td.sales_dollar_amt else 0 end) sum_productX,&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;and re-run the query. Business users love IT when we give them a fast-performing system but are less forgiving when a query that ran blazingly fast yesterday slows to a snail’s pace today. Exadata cannot push down the newly introduced sum for parallel processing by its storage nodes, as the join must be processed first and the storage nodes cannot process joins. Any function or calculation that uses columns from two or more tables must be evaluated on the RAC database servers. 
The query performance is going to degrade significantly, sending the database expert back to the Oracle documentation in an attempt to find a new way to resolve the amended query so it completes in a time acceptable to the business.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Claim: To evenly distribute data across Exadata’s grid of storage servers requires administrators trained and experienced in designing, managing and maintaining complex partitions, files, tablespaces, indices, tables and block/extent sizes.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;While conceding that Oracle Automatic Storage Management automates the task of striping partitions across all available disks, the ASM administration team must still create partitions, configure and manage disk groups for shared storage across instances, choose and implement either 2-way mirroring or 3-way mirroring, and configure Allocation Unit sizes. Additionally, Exadata configuration requires that administrators create and manage tablespaces, index spaces, temp spaces, logs and extents.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;In conclusion, Netezza entered the data warehouse market convinced that the products offered by the dominant vendors, in particular Oracle, were ill-suited to meet the challenges of Big Data and of such complexity as to make them exorbitantly expensive to acquire and use. Exadata only increases the complexity and expense of an Oracle warehouse. 
Greg draws his readers’ attention to the excellent blog at &lt;a class="jive-link-external-small" href="http://dbmsmusings.blogspot.com/"&gt;http://dbmsmusings.blogspot.com/&lt;/a&gt; where Daniel Abadi muses “&lt;em&gt;Both Oracle and Teradata are too expensive for large parts of the analytical database market.&lt;/em&gt;”&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;Greg’s blog reveals one path available to organizations wishing to generate greater value from their data. CIOs willing to build, train, and permanently assign a team of technical experts to choosing just the right combination from a myriad of settings can keep that team continuously employed coercing a database designed for OLTP to function as a data warehouse. I’ll close this blog with a manager’s perspective, from someone who focuses an organization’s limited resources on its highest priorities. Peter Drucker, who introduced us to the concept of the knowledge worker, gave us a pragmatic measure to evaluate our own and our team members’ activity - am I merely efficient (doing things right) or truly effective (doing the right thing)? All the workarounds and clever tuning demanded by Exadata simply don’t exist in TwinFin; Netezza has proven them unnecessary.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:9816c943-cdec-495d-942a-a5aa01b259cf] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/tags">business_intelligence</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">exadata</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">oracle_database_machine</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">price-performance</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">daniel_abadi</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">oltp</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">analytics</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">oracle</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">christian_antognini</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">advanced_analytics</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">netezza</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">data_warehouse</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">performance</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">greg_rahn</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">twinfin</category>
      <pubDate>Thu, 26 Aug 2010 20:32:49 GMT</pubDate>
      <author>Mick</author>
      <guid>http://www.enzeecommunity.com/blogs/nzblog/2010/08/26/talkin-bout-my-generation</guid>
      <dc:date>2010-08-26T20:32:49Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/blogs/nzblog/comment/talkin-bout-my-generation</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/nzblog/feeds/comments?blogPost=1168</wfw:commentRss>
    </item>
    <item>
      <title>Getting SQLException, I Expect BatchUpdateException</title>
      <link>http://www.enzeecommunity.com/groups/enzee-universe/blog/2010/08/23/getting-sqlexception-i-expect-batchupdate-exception</link>
      <description>&lt;!-- [DocumentBodyStart:476f3ba9-8c0f-4bed-99f7-aa9ddf0b6e16] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p style="text-align: left;"&gt;Hi,&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p style="text-align: left;"&gt;The following is my code. I am using Netezza JDBC driver 5.0.10, which supports batch updates, so with this code I expect to get a BatchUpdateException, but I am getting a SQLException instead. Would anyone please help me? This is my first time working with Netezza.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p style="text-align: left;"&gt;try {&lt;/p&gt;&lt;p style="text-align: left;"&gt;    // Queue every SQL statement in the batch&lt;/p&gt;&lt;p style="text-align: left;"&gt;    for (String dBstat : DbStatments) {&lt;/p&gt;&lt;p style="text-align: left;"&gt;        statement.addBatch(dBstat);&lt;/p&gt;&lt;p style="text-align: left;"&gt;    }&lt;/p&gt;&lt;p style="text-align: left;"&gt;    // Run the whole batch; executeBatch() returns one update count per statement&lt;/p&gt;&lt;p style="text-align: left;"&gt;    noOfRecordsUpdated = statement.executeBatch();&lt;/p&gt;&lt;p style="text-align: left;"&gt;} catch (BatchUpdateException bue) {&lt;/p&gt;&lt;p style="text-align: left;"&gt;    // BatchUpdateException is a subclass of SQLException, so it is caught first&lt;/p&gt;&lt;p style="text-align: left;"&gt;    logger.error(bue.getMessage());&lt;/p&gt;&lt;p style="text-align: left;"&gt;} catch (SQLException sql) {&lt;/p&gt;&lt;p style="text-align: left;"&gt;    logger.error(sql.getMessage());&lt;/p&gt;&lt;p style="text-align: left;"&gt;}&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:476f3ba9-8c0f-4bed-99f7-aa9ddf0b6e16] --&gt;</description>
      <pubDate>Mon, 23 Aug 2010 17:12:31 GMT</pubDate>
      <author>vgss</author>
      <guid>http://www.enzeecommunity.com/groups/enzee-universe/blog/2010/08/23/getting-sqlexception-i-expect-batchupdate-exception</guid>
      <dc:date>2010-08-23T17:12:31Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/groups/enzee-universe/blog/comment/getting-sqlexception-i-expect-batchupdate-exception</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/groups/enzee-universe/blog/feeds/comments?blogPost=1167</wfw:commentRss>
    </item>
    <item>
      <title>Thinking about Right-Time Analytics in the Big Data Era at TDWI</title>
      <link>http://www.enzeecommunity.com/blogs/nzblog/2010/08/19/thinking-about-right-time-analytics-in-the-big-data-era-at-tdwi</link>
      <description>&lt;!-- [DocumentBodyStart:7806e4e4-93eb-4ec3-89c8-84bf402b5187] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 8pt;"&gt;&lt;em&gt;&lt;strong&gt;Netezza Director of Product Marketing Razi Raziuddin is blogging today.&lt;br/&gt;&lt;/strong&gt;&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 8pt;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;!--StartFragment--&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 8pt;"&gt;I’ve been at the &lt;a class="jive-link-external-small" href="http://events.tdwi.org/events/san-diego-world-conference-2010/home.aspx"&gt;2010 TDWI World Conference&lt;/a&gt; in San Diego this week, where the theme is “agile BI that delivers data (I would use the term ‘insights’) at the speed of thought.” Timing is everything when it comes to making decisions – and influencing others to make the decisions we’d like to see.&lt;/span&gt;&lt;/p&gt;&lt;p 
class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 8pt;"&gt;We’ve all experienced &lt;strong&gt;Red Car Syndrome&lt;/strong&gt; at some point or another. You test drive a red car. You like it. Suddenly, you start noticing red cars everywhere – not because the number of red cars has increased, but because the experience of driving a red car is now personalized. Online advertisers use Red Car Syndrome to connect consumers with the products they genuinely want, as I was reminded first-hand recently. &lt;span&gt;&lt;/span&gt;While searching for kitchen fixtures online, I noticed that many of the ads featured a pair of pricey fixtures that initially caught our eye, but that we had rejected as exceeding our budget. &lt;span&gt;&lt;/span&gt; But the ads seemed to know our tastes better than we did, and ultimately we succumbed and made the purchase.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;a href="http://www.enzeecommunity.com/servlet/JiveServlet/showImage/38-1166-1623/Red-Car-psd38311+6.jpg"&gt;&lt;img align="left" alt="Red-Car-psd38311 6.jpg" class="jive-image" height="90" src="http://www.enzeecommunity.com/servlet/JiveServlet/downloadImage/38-1166-1623/213-90/Red-Car-psd38311+6.jpg" width="213"/&gt;&lt;/a&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 8pt;"&gt;The experience brought home the power of right-time analytics. Speed is critical in making analytics actionable and delivering real value to the business. The trifecta of huge data volumes, complex analytics and query performance is an increasingly common thread in the BI and data warehousing world. 
It is true not just for online marketers, but cuts across industry lines. Whether it is an insurance provider trying to prevent fraud, a telco determining the cheapest and best path to route a call or a government agency unearthing criminal activity, time to insight from big data makes the difference in every case.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 8pt;"&gt;Doug Henschen recently wrote a &lt;a class="jive-link-external-small" href="http://analytics.informationweek.com/issue/982/informationweek-full-issue-august-9-2010.html"&gt;good article on this topic for InformationWeek&lt;/a&gt; in which he calls out success in the Big Data era as the ability to get faster insights from huge data sets. The article highlights &lt;a class="jive-link-external-small" href="http://www.netezza.com/videos/catalina.aspx"&gt;Catalina Marketing’s  petascale data warehouse environment&lt;/a&gt; and the fast insights they derive from a huge database of 195 million consumers.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 8pt;"&gt;Although not every enterprise has a data warehouse environment quite that large, the need to perform complex analytics and derive insight in the shortest time possible is common in every environment, big or small. While scalable MPP architectures address the big data problem quite well, the big math problem associated with complex and advanced analytics is what many customers still wrestle with. There’s general agreement that in-database processing, especially in scalable MPP systems, is the right solution to the big math problem. 
&lt;a class="jive-link-external-small" href="http://analytics.informationweek.com/issue/982/informationweek-full-issue-august-9-2010.html"&gt;Doug’s article again highlights Catalina’s use of in-database analytics&lt;/a&gt; to radically streamline their analytic modeling environment and gain efficiencies of 10X as a result.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 8pt;"&gt;However, not every data warehouse platform is geared up for the challenges of performing in-database analytics at scale. The first and obvious challenge is the additional processing overhead required to run advanced analytic algorithms alongside the traditional data warehouse workload. You need a system architecture that is not overwhelmed by the data volumes typical of data warehouses in the Big Data era. Then there is the question of what analytics you want to perform. The majority of commonly available analytic libraries are written for in-memory processing in SMP systems and need to be parallelized in order to take advantage of MPP architectures. The analytic system should not only offer parallelized versions of the analytics you desire, but also provide primitives to easily parallelize advanced analytic algorithms while hiding the complexity of parallel programming from developers.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 8pt;"&gt;Finally, the dearth of universally accepted standards in the advanced analytics world poses yet another challenge. 
A typical analytic environment may consist of a mish-mash of commercially available tools such as SAS and SPSS, open source ones such as R and Hadoop (which are gaining popularity), and tons of application code written in various languages such as Java and Python. The underlying system must offer tremendous flexibility in integrating with a wide array of analytic tools and support for a variety of frameworks and languages.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 8pt;"&gt;In subsequent posts, I’ll talk about Netezza’s advanced analytic capabilities to enable big math on big data. In the meantime, as you plan your analytic infrastructures for the Big Data era, tell us what challenges you are coming up against.&lt;/span&gt;&lt;/p&gt;&lt;!--EndFragment--&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:7806e4e4-93eb-4ec3-89c8-84bf402b5187] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/tags">advanced_analytics</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">bi</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">netezza</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">data_warehouse</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">performance</category>
      <pubDate>Thu, 19 Aug 2010 18:31:22 GMT</pubDate>
      <author>razi</author>
      <guid>http://www.enzeecommunity.com/blogs/nzblog/2010/08/19/thinking-about-right-time-analytics-in-the-big-data-era-at-tdwi</guid>
      <dc:date>2010-08-19T18:31:22Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/blogs/nzblog/comment/thinking-about-right-time-analytics-in-the-big-data-era-at-tdwi</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/nzblog/feeds/comments?blogPost=1166</wfw:commentRss>
    </item>
    <item>
      <title>Access Netezza using JDBC</title>
      <link>http://www.enzeecommunity.com/groups/enzee-universe/blog/2010/08/13/access-netezza-using-jdbc</link>
      <description>&lt;!-- [DocumentBodyStart:93444c6a-0639-427d-8a32-6ce47a6e7a28] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p&gt;Hi,&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Would anyone please tell me how to get the list of SQL error codes that Netezza will return?&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Thx&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:93444c6a-0639-427d-8a32-6ce47a6e7a28] --&gt;</description>
      <pubDate>Fri, 13 Aug 2010 13:20:19 GMT</pubDate>
      <author>vgss</author>
      <guid>http://www.enzeecommunity.com/groups/enzee-universe/blog/2010/08/13/access-netezza-using-jdbc</guid>
      <dc:date>2010-08-13T13:20:19Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/groups/enzee-universe/blog/comment/access-netezza-using-jdbc</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/groups/enzee-universe/blog/feeds/comments?blogPost=1165</wfw:commentRss>
    </item>
    <item>
      <title>Netezza SQL ERROR Codes</title>
      <link>http://www.enzeecommunity.com/groups/enzee-universe/blog/2010/08/10/netezza-sql-error-codes</link>
      <description>&lt;!-- [DocumentBodyStart:a94d3303-3379-4be6-be4f-604d1691bfeb] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p&gt;Hi,&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;I am trying to access Netezza from my JDBC code. Would anyone please tell me where I can look for the Netezza SQL error codes?&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;THX&lt;/p&gt;&lt;p&gt;vgss&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:a94d3303-3379-4be6-be4f-604d1691bfeb] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/tags">error</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">codes</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">sql</category>
      <pubDate>Tue, 10 Aug 2010 19:13:58 GMT</pubDate>
      <author>vgss</author>
      <guid>http://www.enzeecommunity.com/groups/enzee-universe/blog/2010/08/10/netezza-sql-error-codes</guid>
      <dc:date>2010-08-10T19:13:58Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/groups/enzee-universe/blog/comment/netezza-sql-error-codes</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/groups/enzee-universe/blog/feeds/comments?blogPost=1164</wfw:commentRss>
    </item>
    <item>
      <title>A Light Touch for Data Warehouse Appliance Customer Support</title>
      <link>http://www.enzeecommunity.com/blogs/daiclegg/2010/08/04/a-light-touch-for-data-warehouse-appliance-customer-support</link>
      <description>&lt;!-- [DocumentBodyStart:308ac8ef-ca4c-41d7-b586-ccf0ebf5d662] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;&lt;span style="font-family: Calibri;"&gt;I’ve been at Netezza now for just over three months and I still feel a bit like the innocent abroad, but at the same time I’m clearly a veteran, as other people join our growing team after me. We have our own EMEA Telco Solutions Manager (Chris Smith) and EMEA Alliances Manager (Kate Tickner). I’m just jealous they seem to be on top of their missions much more quickly than me. It’s great that we have more dedicated resources for working with partners and customers, but I was struck last week by some of the other stuff we do, working with customers, that is really much less visible.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;&lt;span style="font-family: Calibri;"&gt;Over every weekend I get cc’d on a bunch of weekly reports from folk in the org, which I scan, occasionally raise a question about and file. And amongst the reports are those from our technical account managers. These reports list customer accounts, each with a status - green, amber or red (not many of these – no really), the activities in the account this week, issues outstanding and what’s happening next. 
All sensible stuff, and my input is sometimes to ask for a briefing if an account is at amber for two weeks or more, which generally they aren’t, and at red even more infrequently (but not never – get thee behind me corporate).&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;&lt;span style="font-family: Calibri;"&gt;Then I realized that what I’m seeing is every single customer having their account reviewed every single week – proactively. I even read a report the other week in which a customer had requested less frequent contact; a sort of “everything is fine, don’t call me, I’ll call you”. And I was stunned. This is a customer complaining that their supplier’s customer service is too responsive! Now that’s not a situation you meet often in my experience. Yet Netezza regard that level of customer service as standard process. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;&lt;span style="font-family: Calibri;"&gt;This was the next report for that customer:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 0pt 36pt;"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-family: Calibri;"&gt;&lt;span style="color: #365f91; mso-themecolor: accent1; mso-themeshade: 191; mso-bidi-font-weight: bold;"&gt;Customer:&lt;/span&gt;&lt;strong&gt;                                     &lt;/strong&gt;   &lt;span style="color: #365f91; mso-themecolor: accent1; mso-themeshade: 191; mso-bidi-font-weight: bold;"&gt;XXX&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 0pt 36pt;"&gt;&lt;span 
style="font-family: Calibri;"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #365f91; mso-themecolor: accent1; mso-themeshade: 191; mso-bidi-font-weight: bold;"&gt;Status:&lt;/span&gt;&lt;span style="mso-bidi-font-weight: bold;"&gt;&lt;span style="color: #000000;"&gt;&lt;span style="mso-tab-count: 1;"&gt;&lt;/span&gt; &lt;strong&gt;           &lt;/strong&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="color: #9bbb59; font-size: 12pt; mso-bidi-font-weight: bold;"&gt;&lt;span style="mso-tab-count: 2;"&gt;                              &lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;strong&gt;Green&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 0pt 36pt;"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #365f91; mso-themecolor: accent1; mso-themeshade: 191;"&gt;&lt;span style="font-family: Calibri;"&gt;Activity this week:&lt;span style="mso-tab-count: 2;"&gt;                         &lt;/span&gt; No activity this week. Model customer&lt;/span&gt;&lt;/span&gt; &lt;span style="font-family: Calibri; color: #365f91; mso-themecolor: accent1; mso-themeshade: 191;"&gt;:)&lt;/span&gt;&lt;span style="color: #365f91; mso-themecolor: accent1; mso-themeshade: 191;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 0pt 36pt;"&gt;&lt;span style="color: #365f91; mso-themecolor: accent1; mso-themeshade: 191;"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-family: Calibri;"&gt;Issues on hold/carried forward:&lt;span style="mso-tab-count: 1;"&gt;    &lt;/span&gt; None&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 0pt 36pt;"&gt;&lt;span style="color: #365f91; mso-themecolor: accent1; mso-themeshade: 191;"&gt;&lt;span 
style="font-size: 12pt;"&gt;&lt;span style="font-family: Calibri;"&gt;Next steps:&lt;span style="mso-tab-count: 3;"&gt;                                    &lt;/span&gt; Light touch.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 0pt;"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;&lt;span style="font-family: Calibri;"&gt;I just loved that ‘Light Touch’.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:308ac8ef-ca4c-41d7-b586-ccf0ebf5d662] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/tags">netezza</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">customer-service</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">data-warehouse</category>
      <pubDate>Wed, 04 Aug 2010 21:08:10 GMT</pubDate>
      <author>dclegg@netezza.com</author>
      <guid>http://www.enzeecommunity.com/blogs/daiclegg/2010/08/04/a-light-touch-for-data-warehouse-appliance-customer-support</guid>
      <dc:date>2010-08-04T21:08:10Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/blogs/daiclegg/comment/a-light-touch-for-data-warehouse-appliance-customer-support</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/daiclegg/feeds/comments?blogPost=1163</wfw:commentRss>
    </item>
    <item>
      <title>Four Fundamental Differences Between TwinFin and Exadata</title>
      <link>http://www.enzeecommunity.com/blogs/nzblog/2010/08/04/four-fundamental-differences-between-twinfin-and-exadata</link>
      <description>&lt;!-- [DocumentBodyStart:f82c4049-7c9b-4b16-bc9a-89245ba21b32] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p style="padding-left: 30px;"&gt;&lt;!--StartFragment--&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: left;"&gt;&lt;span&gt;Today Netezza is launching a new eBook entitled, “&lt;/span&gt;&lt;a class="jive-link-external-small" href="http://www.netezza.com/exadata-twinfin-compared"&gt;Oracle Exadata and Netezza TwinFin™ Compared&lt;/a&gt;&lt;span&gt;”. As the name implies, this eBook provides a comparison of the Netezza TwinFin data warehouse appliance and Oracle’s “appliance-like” database machine offering.&lt;a href="http://www.enzeecommunity.com/servlet/JiveServlet/showImage/38-1162-1610/ebook_tfexam_thumb.jpg"&gt;&lt;img alt="ebook_tfexam_thumb.jpg" class="jive-image" src="http://www.enzeecommunity.com/servlet/JiveServlet/downloadImage/38-1162-1610/ebook_tfexam_thumb.jpg" style="float: right;"/&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: right;"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;Certainly Netezza is not the first company to compare/contrast its flagship system with Oracle’s most recent entry. Richard Burns, a consultant over at Teradata did a laudable job exposing the technical shortcomings of the Exadata v2 machine as they pertain to data warehousing in a&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://bit.ly/9doIBM"&gt;May 2010 whitepaper&lt;/a&gt;&lt;span&gt;. 
And there have been&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://bit.ly/98oyyI"&gt;several&lt;/a&gt; &lt;span&gt;&lt;/span&gt;&lt;a class="jive-link-external-small" href="http://bit.ly/aMuBjN"&gt;recent&lt;/a&gt; &lt;span&gt;&lt;/span&gt;&lt;a class="jive-link-external-small" href="http://bit.ly/cdsZ5J"&gt;pieces&lt;/a&gt; &lt;span&gt;written on Oracle’s apparent success, although the&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.dbms2.com/2010/07/14/exadata-reference-accounts/"&gt;publicly named customer list has struck some as a bit underwhelming&lt;/a&gt;&lt;span&gt;.&lt;/span&gt; &lt;img height="16px" src="http://www.netezzacommunity.com/images/emoticons/wink.gif" width="16px"/&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;Netezza continues to compete (and win) against Oracle regularly in the marketplace, including in competition with the Exadata v2 product, and so we felt it was high time to put our own comparison story together with today’s eBook and with this little blog posting. Let me know what you think.&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;So where to begin? Let’s start with the fact that the Netezza TwinFin is built to excel at a specific purpose – as the best price/performance platform for Data Warehousing and Analytics in the market. 
Conversely, Oracle has tried to “kill two birds with one stone” in the Exadata v2 – aiming it &lt;strong&gt;primarily&lt;/strong&gt; at the On-Line Transaction Processing applications space, but also making bold claims to performance as a Data Warehouse with its Sun-based Oracle Database Machine (DBM) and Exadata Storage Server, version 2 (Exadata).&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;So why does it matter that Oracle is aiming to do both OLTP and DW in the same system – &lt;em&gt;apart, that is, from at least two decades of people trying-and-failing to do exactly that with the likes of Oracle in previous software and hardware instantiations&lt;/em&gt;? Let’s start with the workload requirements of the two application areas:&lt;/span&gt;&lt;/p&gt;&lt;ul style="padding-left: 30px;"&gt;&lt;li&gt;OLTP systems execute many short transactions, typically of extremely small scope (touching only a handful of records) and in extremely predictable, well-understood access and query patterns. They need to excel at handling these small transactions in very high volume, combined with equally small writes to the database in the form of updates, insertions and deletions. This limited scope, high throughput and “regularity” of the access patterns make OLTP systems great candidates for intelligent caching and (multiple) secondary data structures, such as indices, to speed their processing.&lt;/li&gt;&lt;/ul&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;ul style="padding-left: 30px;"&gt;&lt;li&gt;Conversely, DW systems are typically asked to perform “read-heavy” queries and operations against the current and deep historical data sets. 
Rather than analyzing just a few records, a DW query might look at millions, even billions, of rows from a single table, combined with join logic with multiple other tables. Data warehouse systems are used by company analysts and managers to find the “needle in the haystack” in guiding enterprise decision-making in a more comprehensive and often &lt;em&gt;ad-hoc&lt;/em&gt; manner – frequently limiting the ability to use “tricks of the trade” such as results caching and/or indices.&lt;/li&gt;&lt;/ul&gt;&lt;p class="MsoNormal" style="padding-left: 30px;"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;So the two applications tend to lead to very different system/platform implications. No special “news” there – as I said earlier, people have been trying-and-failing to use a single system for both applications for years.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;Without stealing any more of the thunder of our electronic publication today, let me just lay out what I believe are the fundamental differences between Netezza’s TwinFin and the Oracle Database Machine/Exadata as simply and plainly as I can:&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;table border="2" cellpadding="3" cellspacing="0" style="width: 66%; text-align: left; border: 2px solid #7fc738;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th align="center" style="background-color: #6690bc;" valign="middle"&gt;&lt;span style="color: #ffffff;"&gt;&lt;strong&gt;Netezza TwinFin&lt;/strong&gt;&lt;/span&gt;&lt;/th&gt;&lt;th align="center" style="background-color: #6690bc;" valign="middle"&gt;&lt;span style="color: 
#ffffff;"&gt;&lt;strong&gt;Oracle Database Machine / Exadata v2&lt;/strong&gt;&lt;/span&gt;&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td style="text-align: center; "&gt;True MPP&lt;/td&gt;&lt;td style="text-align: center; "&gt;Hybrid "SMP-plus" Approach&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td style="text-align: center; "&gt;Data Streaming with a Hardware Assist&lt;/td&gt;&lt;td style="text-align: center; "&gt;CPU-intensive Processing for Basic DB Operations&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td style="text-align: center; "&gt;Deep Analytics Processing&lt;/td&gt;&lt;td style="text-align: center; "&gt;Central Cluster-based Approach&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td style="text-align: center; "&gt;No-Tuning-Required Simplicity&lt;/td&gt;&lt;td style="text-align: center; "&gt;&lt;span&gt;Complex Array of Knobs and Levers&lt;/span&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p class="MsoListParagraphCxSpFirst" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpFirst"&gt;In my view, these are "big deal" differences. They're not the result of a simple &lt;em&gt;feature gap&lt;/em&gt; to be closed in an upcoming point-release, but rather go directly to limitations at the heart of the Oracle DBM/Exadata system architecture and/or business culture. To address them would require a major rearchitecting, or at least refactoring, of Oracle's decades-old DBMS code base. They also happen to be &lt;em&gt;highly visible to customers and prospects,&lt;/em&gt; which makes for some interesting comparisons in head-to-head on-site Proofs of Concept (POCs).&lt;/p&gt;&lt;p class="MsoListParagraphCxSpFirst" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpFirst"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle"&gt;&lt;strong&gt;1)&lt;/strong&gt; &lt;strong&gt;True MPP vs. 
a Hybrid "SMP-plus" Approach&lt;/strong&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;Netezza’s TwinFin uses a full MPP approach to data warehousing, pushing &lt;strong&gt;all&lt;/strong&gt; of the processing down as close as possible to where the data is stored and maximizing the processing horsepower of MPP for scalability, throughput and performance – for even the most complex workloads. Using the MPP method of dividing the workload and attacking query problems in parallel, Netezza has been able to demonstrate market-leading data warehouse price-performance across four generations of data warehouse appliances.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;Oracle’s DBM/Exadata takes a hybrid approach adding Exadata Storage nodes largely to handle data decompression and predicate filtering tasks, but still relying primarily on the SMP cluster of Oracle RAC to handle most of the data warehouse tasks, including complex joins. In addition the SMP cluster also must act as the central distribution point for any data that needs to be redistributed between and across Exadata nodes. To try to minimize this, Oracle and Sun’s solution was to “&lt;/span&gt;&lt;em&gt;throw hardware at the problem&lt;/em&gt;&lt;span&gt;” (quoting&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://bit.ly/9doIBM"&gt;Teradata’s Mr. 
Burns&lt;/a&gt;&lt;span&gt;), over-engineering interconnections, processor rates and other elements required because of all of this data movement, rather than refactoring and solving a fundamental software architecture issue.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;The difference between the two is akin to an 8-lane continuous streaming superhighway in the TwinFin instance versus multiple freeways converging on and necking down to a two-lane country road via a “traffic roundabout”. I live in Massachusetts and can attest to the negative impact of taking multiple highways down to a single road – it happens every weekend at the gateway to and from Route 6 on Cape Cod.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle"&gt;&lt;strong&gt;2) Data Streaming with a Hardware Assist vs. CPU-intensive Work for Basic DB Operations&lt;/strong&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;In addition to the advantages of the MPP architecture for data warehousing, the TwinFin system makes use of hardware acceleration for increased query and analytics performance. 
This acceleration comes in the form of the "&lt;a class="jive-link-external-small" href="/blogs/07"&gt;DB Accelerator&lt;/a&gt;" that is part of each S-Blade in the TwinFin system architecture. Each DB Accelerator provides four dual-core Field-Programmable Gate Arrays (FPGAs), which take care of fundamental processing steps such as decompression, predicate filtering and ACID-compliant data visibility at the full scan rate of the data from disk. Because this device sits so close to the disks for which it performs its processing, the TwinFin system gains much more performance leverage: data can be filtered, processed and value-added before undergoing any unnecessary CPU processing or having to be transported across an expensive network.&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;And the fact that it is a field-programmable device means that Netezza can use it to introduce additional features and performance through a simple upgrade to our NPS software/firmware – as Netezza has done with two phases of &lt;a class="jive-link-external-small" href="http://bit.ly/amSYy5"&gt;hybrid column/row-level compression technology&lt;/a&gt; (first introduced in 2005 and, with Release 6.0, scaling as high as 32:1 compression, depending on data patterns) and with our high-performance implementation of row-level security. 
Because it's performed in the FPGA in TwinFin, "&lt;em&gt;Compression = Performance&lt;/em&gt;"; so if a customer's data is compressed by a 4:1 factor, the effective data streaming rate for processing queries is increased four-fold.&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;Conversely, the DBM/Exadata system relies entirely on CPU processing. In fact, the great majority of the functionality provided by the Exadata nodes in the DBM/Exadata system replicates the functionality included in each FPGA core of the TwinFin - data decompression and predicate filtering. Because of the CPU-intensive nature of decompressing data in the DBM/Exadata system, Oracle "strongly suggests" lesser compression when data is required for high-performance data warehousing vs. "cooler" queryable archive purposes. Again, the heavy lifting for query processing and analytics is left to the central SMP cluster nodes rather than parallel Exadata nodes, forcing Oracle to "throw hardware at the problem".&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle"&gt;&lt;strong&gt;3)&lt;/strong&gt; &lt;strong&gt;Deep Analytics Processing vs. 
Central Cluster Analytics&lt;/strong&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;Netezza brings analytics to where the data is stored – as close as possible to where it is stored to do the processing&lt;/span&gt; – not &lt;span&gt;just to decompress it and do predicate filtering, but to complete as much of the complex analytics as is possible, &lt;span style="text-decoration: underline;"&gt;in parallel&lt;/span&gt;. That’s as true of the “traditional” OLAP analytics of SQL-based data warehousing as it is of the advanced and predictive analytics enabled by the new capabilities of i-Class in the “&lt;/span&gt;&lt;a class="jive-link-external-small" href="http://www.netezza.com/releases/2010/release062110_4.htm"&gt;Second Wave of TwinFin&lt;/a&gt;&lt;span&gt;”.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;With i-Class, Netezza introduces a comprehensive, scalable and high-performance approach to advanced analytics for both our customers and partners, spanning Linear Algebra/Matrix manipulation, and engines for R and Hadoop along with several programming languages including C, C++, Java, Python and even Fortran. 
The i-Class functionality also offers plug-ins and packages for the Eclipse IDE and R GUI, and pre-built, analytic functions engineered to deliver performance at scale spanning data preparation, mining, predictive analytics and spatial functions together with access to analytics functions from the GNU Scientific Library and R CRAN repository.&lt;/span&gt; Extended by the i-Class embedded analytics capabilities, TwinFin allows our partners and customers to push-down applications, functions and algorithms going well beyond standard set-based SQL, at scale with high performance, freeing them of the latency and sampling requirements demanded by off-board processing platforms for advanced analytics.&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;The Oracle DBM/Exadata performs the majority of the OLAP analytics in the central cluster (RAC) nodes, after traversing the "traffic roundabout". And apart from basic scoring functionality, virtually ALL of the advanced analytics are performed in the cluster nodes as well. 
Placing the predominance of processing in the central SMP cluster means that both the functionality and scale of the analytics are limited by the capacity and performance that the SMP cluster can provide - typically limited to the elements included in Oracle's own "&lt;/span&gt;&lt;em&gt;Data Mining&lt;/em&gt;&lt;span&gt;" package.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;The DBM/Exadata’s requirement for shipping the data from the storage arrays to the central cluster for analytics is akin to backhauling full massive truckloads of materials from a mining site to pick out the gold at a central headquarters rather than sifting out the most important nuggets in parallel and sending only those valuable elements back, as in the case of TwinFin.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle"&gt;&lt;strong&gt;4)&lt;/strong&gt; &lt;strong&gt;No-Tuning-Required Simplicity vs. a Complex Array of Knobs and Levers&lt;/strong&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;For a long time, the simplicity of the Netezza data warehouse appliance has shone through most strongly in the extremely limited tuning requirements it imposes on administrators of the system, particularly as compared to Oracle-based systems. Simplifying the system management is core to Netezza’s “&lt;/span&gt;&lt;em&gt;appliantization&lt;/em&gt;&lt;span&gt;” of the data warehouse and analytics platform. 
Rather than managing a “coordinated collection” of technology assets, the system and database administrators of TwinFin interact with a single appliance and use the redundant Linux-based SMP host nodes as the interaction point for all activities. Everything from database configuration, data distribution, data mirroring, monitoring and software upgrades to day-to-day management is simplified (in the words of one TwinFin customer, “&lt;/span&gt;&lt;em&gt;It’s Netezza-easy – it just works.&lt;/em&gt;&lt;span&gt;”).&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;No indexing is necessary (or even supported) in TwinFin to achieve high performance. Just about the only requisite “tuning” of the system is the definition of the distribution key for spreading data across all the S-Blades – typically the primary keys of the tables. Even in the internal management structure of TwinFin, our system management has been configured to get the maximum performance from the commodity subsystems (blades, chassis, disk arrays and network) by connecting them in novel ways and then managing them at a &lt;strong&gt;system level&lt;/strong&gt;, rather than at the subsystem or rack-level.&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;While it is true that Oracle has simplified some of the tuning knobs and levers in the DBM/Exadata, prospective customers should ask them if they really have moved into the domain of requiring only a small handful of tuning knobs &amp;amp; settings; or whether they still require, or more colloquially, “&lt;em&gt;strongly suggest&lt;/em&gt;” the use of dozens or even hundreds of settings (depending upon the number of objects being maintained and 
optimized). How many dozens of IP addresses are needed to configure and manage the DBM/Exadata (TwinFin requires&lt;/span&gt; only two)? Oracle even have &lt;a class="jive-link-external-small" href="http://bit.ly/cJY7Iu"&gt;a special service&lt;/a&gt; to help DBM/Exadata customers migrate and &lt;strong&gt;tune&lt;/strong&gt; their systems and databases for performance, and some of their leading Performance Architects even talk about the requirement of using functions like the Oracle SQL Tuning Advisor as an inevitable &lt;em&gt;&lt;a class="jive-link-external-small" href="http://bit.ly/bGWaFs"&gt;fait accompli&lt;/a&gt;&lt;/em&gt;.&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="padding-left: 30px;"&gt;&lt;span&gt;By Oracle’s own admission, the time-savings that customers can expect to achieve in managing and tuning the DBM/Exadata system in Oracle 11g r2 is&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://bit.ly/ddRd9i"&gt;only 26% less than in Oracle 11g.&lt;/a&gt; &lt;span&gt;Contrast that with installation after installation of Netezza appliances where hundreds of terabytes of data under management in one or more data warehouses are being maintained by two, or even less than one, FTE, rather than a team of Oracle specialists. It all depends on one’s perspective and philosophy in building a real appliance for the data warehouse market. 
Where others may see the need to tune, partition, index and sub-index data sets for performance purposes as an inevitability, Netezza sees that same need as reason to enhance TwinFin’s capabilities in order to obviate it.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoListParagraphCxSpMiddle"&gt;&lt;span&gt;All of this really adds up quickly to a significant price-performance advantage for customers of TwinFin – and with our limited tuning and simplified operations, also translates into much more rapid time-to-value for Netezza’s customers, too.&lt;/span&gt; &lt;span&gt;So that’s it – four simple fundamental differences that really set the TwinFin appliance apart from the DBM/Exadata.&lt;/span&gt; &lt;strong&gt;Agree? Disagree? Let me know what you’re thinking.&lt;/strong&gt; &lt;span&gt;And now, go over and&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.netezza.com/exadata-twinfin-compared"&gt;have a look at today’s eBook release&lt;/a&gt; &lt;span&gt;for the rest of the story.&lt;/span&gt;&lt;/p&gt;&lt;!--EndFragment--&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:f82c4049-7c9b-4b16-bc9a-89245ba21b32] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/tags">oracle</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">exadata</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">database_machine</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">data_warehouse_appliance</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">teradata</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">dw</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">analytics</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">price-performance</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">i-class</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">data_warehouse</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">olap</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">oltp</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">mpp</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">twinfin</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">dbm</category>
      <pubDate>Mon, 02 Aug 2010 23:56:30 GMT</pubDate>
      <author>pfrancisco</author>
      <guid>http://www.enzeecommunity.com/blogs/nzblog/2010/08/04/four-fundamental-differences-between-twinfin-and-exadata</guid>
      <dc:date>2010-08-02T23:56:30Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/blogs/nzblog/comment/four-fundamental-differences-between-twinfin-and-exadata</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/nzblog/feeds/comments?blogPost=1162</wfw:commentRss>
    </item>
    <item>
      <title>Netezza and Data Analytics as a Gateway to the Intelligent Economy</title>
      <link>http://www.enzeecommunity.com/blogs/ceoblog/2010/08/03/netezza-and-data-analytics-as-a-gateway-to-the-intelligent-economy</link>
      <description>&lt;!-- [DocumentBodyStart:592214aa-1711-4def-bac5-756b95936603] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p&gt;&lt;!--StartFragment--&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;The "Intelligent Economy" is much more than a trendy buzz phrase or the name of a business school seminar.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;It's the new reality for enterprises today.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;The Intelligent Economy is about fundamentally changing the way we view our businesses and our customers, about inviting change when we decide change is needed – but also about recognizing what we do that makes sense and valuing it accordingly.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span 
style="font-size: 10pt;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;A June, 2010 IDC Vendor Spotlight noted that the “demand to respond faster and with greater insight to ongoing internal and external events based on facts is increasing,” and that “data management and analytics challenges of the intelligent economy are likely to overwhelm organizations unprepared for the emerging changes.” (You can &lt;a class="jive-link-external-small" href="http://www.netezza.com/documents/whitepapers/idc_enabling_intelligent_economy.pdf"&gt;download the PDF of the IDC report&lt;/a&gt; here.)&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;Gut checks and intuition are still valuable when it comes to making management decisions – but actionable, on-demand information from analytics is shaping up to be a key differentiator when it comes to competing these days.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;strong&gt;ENTERPRISES MAKE UP THE INTELLIGENT ECONOMY&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;The Intelligent Economy is made up of "Intelligent Enterprises" – companies and organizations that not only recognize the new reality – but are thriving because of it.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 
10pt;"&gt;For these enterprises, data isn't viewed as a problem that must be managed and contained. For them, data is a gift that exposes hidden truths about their futures.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;We know at Netezza – &lt;a class="jive-link-external-small" href="http://www.netezza.com/customers/index.aspx"&gt;based on the real-life experiences of our customers&lt;/a&gt; – that data is a valuable, sustainable resource that can change the way you view your customers and run your business in fundamental ways - forever.&lt;br/&gt;&lt;br/&gt; The ideas that stem from the analytics solutions we enable for our customers are much more than abstract formulas on a virtual chalkboard. They result in real, positive, lasting change. Our technology at Netezza becomes a source of inspiration, a catalyst to a new way of thinking about the way people interact with the world and with each other.&lt;/span&gt; &lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;&lt;strong&gt;BUSINESS ANALYTICS COME OF AGE&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;The energy and enthusiasm we experienced at &lt;a class="jive-link-external-small" href="http://www.netezza.com/userconference/home.html"&gt;2010 Enzee Universe&lt;/a&gt; this June on Boston's waterfront proved to me that we have reached a tipping point when it comes to accepting data analytics on a much deeper, 
broader level.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;object height="350" width="425"&gt;&lt;param name="movie" value="http://www.youtube.com/v/yFPN1cE7Abk"/&gt;&lt;param name="wmode" value="transparent"/&gt;&lt;embed height="350" src="http://www.youtube.com/v/yFPN1cE7Abk" type="application/x-shockwave-flash" width="425" wmode="transparent"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;As I said in my keynote address, Netezza embraces a set of core principles that are simply hard for others to reproduce:&lt;/span&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;Your success as our customer is our top objective&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;We embrace the fact that we are part of an ecosystem of partners and technologies that provide you with a complete solution &lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;Simplicity is the key to making analytics more accessible and a fundamental driver behind everything we do&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;We want to be a company that is easy to do business with and one that is viewed as a partner, not just a vendor&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;We value our customers’ input and integrate it into our thinking every day&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;span style="font-size: 
10pt;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;Many of our customers are solving problems they could previously only DREAM of solving, and doing so with a level of agility and simplicity that lets them apply their energy to the business opportunity, freeing them from the complexities of the platform to focus on what really matters: the business solution.&lt;br/&gt;&lt;br/&gt; This, in a nutshell, is what is enabling the Intelligent Economy to emerge: the enormous amounts of human time and energy previously required to simply process information can now go toward inspiration, learning, and improving.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"&gt;Over the next few blog posts, I will be writing about the real-life experiences of some of the Intelligent Enterprises that make up this emerging Intelligent Economy – organizations that are using high-performance data analytics to make critical time-sensitive decisions, predict future events, and change the way they do business for the better, forever. Stay tuned.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:592214aa-1711-4def-bac5-756b95936603] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/tags">business_analytics</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">enzee</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">netezza</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">performance</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">bi_and_analytics</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">data</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">intelligent_enterprise</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">enzee_universe</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">business_intelligence</category>
      <pubDate>Mon, 02 Aug 2010 19:55:37 GMT</pubDate>
      <author>jbaum</author>
      <guid>http://www.enzeecommunity.com/blogs/ceoblog/2010/08/03/netezza-and-data-analytics-as-a-gateway-to-the-intelligent-economy</guid>
      <dc:date>2010-08-02T19:55:37Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/blogs/ceoblog/comment/netezza-and-data-analytics-as-a-gateway-to-the-intelligent-economy</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/ceoblog/feeds/comments?blogPost=1161</wfw:commentRss>
    </item>
    <item>
      <title>Rewiring our thinking - a view to a kill</title>
      <link>http://www.enzeecommunity.com/blogs/grill/2010/07/22/rewiring-our-thinking---a-view-to-a-kill</link>
      <description>&lt;!-- [DocumentBodyStart:2e9df8dd-dbd9-4214-a3c4-e07e2fd7a596] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p&gt;Over the past several projects, the question of views has come up consistently: when to use them, when not to, and what to expect. Views are one of those mainstay workhorses that we love to hate and sometimes hate to love, but used correctly, they can save a world of hurt and lost development time.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;So we would ask the question - why use a view at all? Isn't the table definition good enough? And what of a synonym? Isn't this just as good?&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Well, synonyms are handy for configuration management and invaluable for testing. For BI, however, they don't pass through their underlying relation's metadata for consumption by the BI tool, so this can be problematic. We also cannot refresh a synonym easily. It has to be dropped and then re-created in two operations, whereas a view gives us the concurrency-protected operation of "create or replace view", which is &lt;em&gt;muy bien&lt;/em&gt;. More on synonyms in another blog entry.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Views can easily reach across databases, giving us the ability to stand up a consumption point that contains one part tables and one part views without having to push data around (very handy for, say, a reference database that we want as an on-demand resource of fresh information). I'm a big fan of setting up consumption-point databases so that a user comes to a pre-designated place, not the master repository, to fulfill their information needs. 
This decouples the user from the master repository and gives us enormous freedom in the ongoing enhancement of their user experience. Views are the vehicle towards this goal.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Views also let us do on-demand case/when conversions and typecasting that can be completely encapsulated from the consuming process.&lt;/p&gt;&lt;p&gt;&lt;br/&gt;And of course a really cool part about Netezza views is that we can include as many columns as we want in the view's "select" clause, and the view will not fetch them all, only the ones mentioned in the select that consumes the view. This is a win-win, because otherwise it would fetch all of the columns and then drop the majority on the floor to deliver a few.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Views have the lightweight nature of a single SQL statement that can be easily installed, where a stored proc often contains multiple SQL statements. Both of these mechanisms serve to hide the logic from the BI tool. But think about this - would we use the stored proc as part of another join? Or would we expect to just select from the stored proc and consume an answer? The more complex the operation, the more we need to just select-and-consume, and take the burden off the BI tool to know more than it has to.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;A pernicious part of integrating BI tools is just that - expecting that the tool will &lt;em&gt;&lt;strong&gt;know&lt;/strong&gt;&lt;/em&gt; all it needs to know to interact with the Netezza MPP. This is - as you may painfully discover - a false expectation. Case in point, we might have a very creative intersection table between two large fact tables, and we can formulate a query that will browse the information-we-want in mere seconds. 
Then we plug in our BI tool and ask it to manufacture a query to do the same thing, but it struggles. Now we have to make a call - do we deploy the BI tool in the hopes that later releases will resolve this, or do we install a view or stored procedure that adapts the BI tool to our data model, and then wait for the BI tool to get better in a later release? You see, we can always toss the adaptation when our BI tool gets better. But we cannot allow our user-experience to languish on the same terms. More on this in another essay.&lt;/p&gt;&lt;p&gt;&lt;br/&gt;So before I jump into a lot of other things we like about views, I'll address some of the above in their more malignant form.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;I'll loosely divide views into two buckets - simple and complex. The simple view consumes a single table and may have columnar transforms on it. A complex view, simply put, has more than one table in the join logic.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;A simple view cannot be easily misused, but a complex view can be misused so easily it will make your best troubleshooter's head spin. For example, I cannot count the times I've seen a master query joining on a view, which in turn joined on a view, which in turn joined on a view, and so on. How deep can you go? This is not the issue at all. The issue is in treating the view as though it is a reusable, inheritable object rather than a standalone select-and-consume capability. 
So where do we draw the line?&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Transactional thinking - that is, the notion that we can install nested (inherited) views because they handle a transaction at a time anyhow and any given instance of them will have a negligible performance problem - is completely washed away when dealing with multi-billion-row scales on a Netezza platform. It's not a transactional platform, so each view potentially initiates a full table scan. Multiply these nested-upon-nested views and we have nested table scans - sometimes &lt;em&gt;several separate scans on the same table&lt;/em&gt;. Which is more efficient: to look at a multi-billion-row table once, or multiple times?&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;One customer had a query that started running very slowly one day. We went through a process of discovery to find out what had changed. It seems that a new version of an existing view had been installed, and the bad query was consuming this view deep under the covers. The bad query and the view were both accessing one of the largest tables in the database, so the bad query was now scanning the big table twice, taking a double hit on the master query itself. Even worse, the changed view did not leverage the big table's zone maps or its distribution key. So a change in one place dramatically affected the unchanged functionality of a master query.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Because we are embracing economies of extraordinary scale, dynamic objects have a propensity to lose performance integrity over time. What worked yesterday may not work today, so we have to tune it. Netezza is so efficient that this tuning necessity may not arise for years after the implementation (in one case, four years afterward). 
By that time, the knowledge of the system's dependencies is not fresh in everyone's mind, so it is easy to make a spot-fix on the view and deploy it. In so doing, we may create a cascading effect for all the other places that consume the view and do so with the expectation of original behavior. In short, the latent nested view architecture is a minefield. We should not implement it because it creates trouble from day one, even though nobody has stepped on a mine just yet.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;At one customer site we had to sift through six levels of view logic to find the performance problem. The customer wanted to know what they should do to fix the problem, but "the problem" was in the overall implementation and the nested views, not the one bad view, or for that matter, the recent performance symptoms of a minefield implementation.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Views can behave as traditional objects if they are single-table views or they leverage additional tables that are small and inconsequential to performance. Don't ever include a big-fat table in a view as part of a performance-boosting strategy unless you can designate that the view is in fact a standalone entry point and not something that can arbitrarily participate in the JOIN clause of another master query. Why is this? Invariably we will forget the complexity of the view and then attempt to join it in another operation. 
For a BI tool, this could be highly problematic as well, because a view that was once simple could spontaneously go complex, and if it affects performance, we'll be pulling our hair out to find the problem through what reduces to a scavenger hunt, or worse, a submarine hunt.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Many BI tools simply choke on automatically forming a "complex" Netezza query because there is an implicit assumption of indexes via primary keys, and if these don't exist, the BI tool does the best it can, which in many cases is the least-common-denominator of a query structure. This doesn't play well on the SPUs for large-scale queries. I cannot count how many times I've seen a convoluted query that we just de-engineered and simplified, and that then ran an order of magnitude faster than the one conjured by the BI tool, yet nothing the tool folks could do seemed to make the BI tool form it the same way. To the rescue: a view that did the right thing - and that was that.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;What's that? Putting together a view diminishes the flexibility of the query? Only marginally, and since we're dealing with billions of rows, we don't have much runway for "ultimate" flexibility anyhow. The larger the datasets, the more we need to make sure the queries are as efficiently formed as possible. And since this means as simply formed as possible, we're not talking about BI tool query engineering, but query &lt;em&gt;de-engineering.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;br/&gt;To avoid pain and injury, don't treat all views the same. If we have a complex view, we should tune and designate it as standalone. No matter how much we like its results, it is better not to just arbitrarily include it in another join, the primary reason being that most views are not set up to regard a distribution. 
So when we include it with our other join, the resolution of distribution might take the form of the least-performing, lowest common denominator. We don't want that.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;One alternative, oddly, is to CTAS - execute such a view in context and insert its data into a temporary table, then use the temporary table in the master join. This affords us the option to (a) leverage the view's normally small output, (b) preserve the distribution or (c) align the distribution to the next operation, and (d) simplify the implementation. Of course, your BI tool may not support this, or may support it in an inefficient fashion. Most of the major BI tools will accommodate advanced scenarios, so get your product support rep on the wire and have a heart-to-heart.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Yet another alternative is to use the view like an in-line view, except in the where-clause in a correlated sub-query. This can often take the form of a where-not-exists clause or the like and can also be very efficient.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Another alternative is to break apart the view's logic and assimilate it into the larger view so that all logic is preserved. But you'll be maintaining that logic in two places, right? Not necessarily. We have a lot of view DDL executables that do not directly spawn from a modeling tool, several of them in Bash script, which provides for parameterization of logic. If we put the logic into a parameter, then produce the views by including the parameterized logic, we will maintain the core logic in one place (the script) but actually deploy two views that leverage it. This is essentially what happens under the covers with many object-oriented environments anyhow. 
Multiple objects will consume another class and deploy an instance that includes that class, so this approach embraces that inheritance pattern, not in the dynamic run-time of the view, but in the view's initial DDL-level deployment.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;# Shared filter logic, maintained in one place (references only table alias a)&lt;br/&gt;MYLOGIC=$( cat &amp;lt;&amp;lt;!&lt;br/&gt;a.limit1 between 50 and 60 and&lt;br/&gt;a.limit2 between 1000 and 50000 and&lt;br/&gt;a.tran_amt &amp;lt; 10000&lt;br/&gt;!&lt;br/&gt;)&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;view1="create view view1 as select col1, col2, col3 from mytable a where $MYLOGIC ;"&lt;/p&gt;&lt;p&gt;&lt;br/&gt;view2="create view view2 as select a.col1, col2, col4 from mytable a join yourtable b on a.id = b.id where a.col1 = b.col1 and b.employee_id &amp;lt;&amp;gt; 9999 and $MYLOGIC ;"&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;If our modeling tool supports this capability as part of its functionality, we should leverage it before simply bolting a view into a join. If our modeling tool does not support it, this scripted DDL scenario is easy enough to formulate and leverage without a lot of overhead. The objective: two views that both behave as optimized joins, rather than one view that behaves as a join-with-a-view.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Either way, there is a theme here: simply including the complex view as part of another join's logic - as though it were a table - is risky and can, even at the outset, offer up such bad performance as to be a non-starter. So a plain-vanilla practice should be to make the complex view behave in a standalone query-and-consume fashion by default. 
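&lt;/p&gt;&lt;p&gt;Returning to the scripted DDL approach described above, here is a minimal sketch of how it might be packaged, assuming Bash and the same illustrative table and view names (mytable, yourtable, view1, view2). The helper name build_view and the variable SHARED_FILTER are hypothetical, invented for this illustration and not part of any Netezza tool. The script only prints the DDL; handing it to a SQL client is left out.&lt;/p&gt;&lt;pre&gt;
```shell
#!/usr/bin/env bash
# Sketch only: table, column, and view names are illustrative assumptions.

# Core filter logic, maintained in exactly one place.
SHARED_FILTER="a.limit1 between 50 and 60 and a.limit2 between 1000 and 50000"

# build_view NAME COLUMNS FROM_CLAUSE [EXTRA_PREDICATE]
# Prints one "create or replace view" statement that embeds the shared filter.
build_view() {
  local name="$1" columns="$2" from_clause="$3" extra="${4:-}"
  local where_clause="$SHARED_FILTER"
  if [ -n "$extra" ]; then
    where_clause="$extra and $SHARED_FILTER"
  fi
  printf 'create or replace view %s as select %s from %s where %s;\n' \
    "$name" "$columns" "$from_clause" "$where_clause"
}

# Two standalone views, each deployed with its own copy of the shared logic.
view1_ddl=$(build_view view1 "a.col1, a.col2, a.col3" "mytable a")
view2_ddl=$(build_view view2 "a.col1, a.col2, b.col4" \
  "mytable a join yourtable b on a.id = b.id" "a.col1 = b.col1")

printf '%s\n' "$view1_ddl" "$view2_ddl"
```
&lt;/pre&gt;&lt;p&gt;Running the script prints both statements for review. When the core logic changes, only the SHARED_FILTER line is edited, and both views are regenerated from it.&lt;/p&gt;&lt;p&gt;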
Make no assumptions that it is okay to arbitrarily include it in a larger query's join clause.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;The further downside is that a de-facto join-with-a-view can work really well at the outset, but the scale of the data can catch up to even the most robust of implementations, and wiring up the complex view dependencies creates a problem that will not scale, but will only become obvious over time (a minefield).&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;One group invoked a standard for view naming conventions. The simple views would have no prefix at all, so they would look like tables to the casual user. Fair game and all that. The complex views were labeled as v_&amp;lt;viewname&amp;gt; as a cue to a user or report builder: don't use it in the join of a larger query. You'd think that if there was an implicit rule to avoid using anything with a "v_" prefix, people would play nicely. But not so, since your reporting users may have come from an RDBMS background where it's perfectly okay to mix views into the master query. Awareness of the standard is one thing, but actually embracing it is another. We cannot protect our systems from people who either don't know the rules, don't understand them, or cannot map their experiences from an RDBMS to an MPP.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;So a suggestion here would be to name the view in a manner that is a departure from common view nomenclature. Calling it an sp_(NAME) might draw the ire of your admins, who want stored procs named for what they are and don't want those names obfuscated. But if our views are not really common views, and have caveats on their usage, we need a safer naming convention, one that aligns with the goal we are trying to achieve - that of adapting the BI tool to the MPP. 
One group used a naming convention of "bi_", while another used "rpt_", and still another used the common acronym for their given BI tool. The point is to adopt a convention that is somewhat unconventional, so that those with conventional thinking are able to transform their thinking without finding themselves in a minefield.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Nothing is worse than overlooking a minefield - it's a scary view - a view to a kill.&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:2e9df8dd-dbd9-4214-a3c4-e07e2fd7a596] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/tags">view</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">business_intelligence</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">microstrategy</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">adaptation</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">performance</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">stored_procedure</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">consumption_point</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">architecture</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">views</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">reporting</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">business_objects</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">adapt</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">fit_for_purpose</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">users</category>
      <pubDate>Thu, 22 Jul 2010 15:14:14 GMT</pubDate>
      <author>dbirmingham</author>
      <guid>http://www.enzeecommunity.com/blogs/grill/2010/07/22/rewiring-our-thinking---a-view-to-a-kill</guid>
      <dc:date>2010-07-22T15:14:14Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/blogs/grill/comment/rewiring-our-thinking---a-view-to-a-kill</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/grill/feeds/comments?blogPost=1160</wfw:commentRss>
    </item>
    <item>
      <title>Hadoop &amp; Netezza: Synergy in Data Analytics – PART 2</title>
      <link>http://www.enzeecommunity.com/blogs/nzblog/2010/07/22/hadoop-netezza-synergy-in-data-analytics-part-2</link>
      <description>&lt;!-- [DocumentBodyStart:6cec7cb9-1c0d-45b4-8197-746774b95b52] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p class="MsoNormal" style="text-align: left;"&gt;&lt;span&gt;I mentioned in my previous post that Netezza is excited about our partnership with Cloudera and Hadoop because we’ve already seen some of our customers benefit from the synergy of Hadoop and Netezza TwinFin™ technologies working together.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;As I noted, these types of strategies play to the strengths of both technologies and roughly break down into two categories: 1)&lt;/span&gt; the &lt;a class="jive-link-external-small" href="http://bit.ly/bt4esN"&gt;use of a Hadoop Cluster for data ingestion&lt;/a&gt;&lt;span&gt;, and 2) using a Hadoop Cluster for long-term data retention, which I’m addressing today.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;strong&gt;Netezza TwinFin with a Hadoop Cluster Used for Queryable Archive Analytics&lt;/strong&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;The second pattern we have seen customers deploy is one in which the Hadoop Cluster is used for long-term data retention, or as a “queryable archive”. Here one could think of Hadoop as a complementary analytic extension of the Netezza TwinFin when there is far less premium placed on low-latency or high-performance. 
In addition to the weblog and unstructured data analysis discussed in Pattern 1, the queryable archive could also retain long-term copies of structured data that had previously been loaded into the high-performance TwinFin appliance.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: center;"&gt;&lt;a href="http://www.enzeecommunity.com/servlet/JiveServlet/showImage/38-1159-1600/Hadoop-NZ+3.jpg"&gt;&lt;img alt="Hadoop-NZ 3.jpg" class="jive-image" src="http://www.enzeecommunity.com/servlet/JiveServlet/downloadImage/38-1159-1600/Hadoop-NZ+3.jpg"/&gt;&lt;/a&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: center;"&gt;&lt;strong&gt;Hadoop Cluster Used for Queryable Archive&lt;/strong&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;!--StartFragment--&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;With a mix of structured, semi-structured and unstructured data loaded across the two complementary systems, customers can alter the level of granularity and data retention periods across each and typically use TwinFin for processing “hot” data and the Hadoop Cluster for processing “cool” or “cold” data, perhaps with specialized analytics. 
A deployment of this pattern could look like the following diagram:&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;a href="http://www.enzeecommunity.com/servlet/JiveServlet/showImage/38-1159-1601/Hadoop-NZ+Arch+3.jpg"&gt;&lt;img alt="Hadoop-NZ Arch 3.jpg" class="jive-image" src="http://www.enzeecommunity.com/servlet/JiveServlet/downloadImage/38-1159-1601/Hadoop-NZ+Arch+3.jpg"/&gt;&lt;/a&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;!--StartFragment--&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;Readers should view this pair of posts as a “point-in-time” look at the market. Our customers continue to innovate and make use of the complementary strengths of TwinFin and Hadoop. And Netezza will continue to innovate both inside the appliance – adding performance, scale, workload management capabilities and especially with the advanced analytics of i-Class, through partnerships like the one&lt;/span&gt; &lt;span&gt;announced&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.netezza.com/releases/2010/release071510.htm"&gt;with Cloudera&lt;/a&gt; &lt;span&gt;a week ago, and through expansion of our platform, software and virtualization capabilities beyond the TwinFin and Skimmer™ appliances. Those innovations should help alter and/or enhance some of the deployment directions discussed here.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;strong&gt;Now, as I said at the outset of these two posts, I’d like to hear from you on your Netezza &amp;amp; Hadoop co-existence deployment and/or compatibility wish-list ideas. 
What would you like to see?&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:6cec7cb9-1c0d-45b4-8197-746774b95b52] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/tags">twinfin</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">cloudera</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">data_warehouse</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">i-class</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">performance</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">analytics</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">bi</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">hadoop</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">sql</category>
      <pubDate>Wed, 21 Jul 2010 01:04:26 GMT</pubDate>
      <author>pfrancisco</author>
      <guid>http://www.enzeecommunity.com/blogs/nzblog/2010/07/22/hadoop-netezza-synergy-in-data-analytics-part-2</guid>
      <dc:date>2010-07-21T01:04:26Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/blogs/nzblog/comment/hadoop-netezza-synergy-in-data-analytics-part-2</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/nzblog/feeds/comments?blogPost=1159</wfw:commentRss>
    </item>
    <item>
      <title>Hadoop &amp; Netezza: Synergy in Data Analytics Results in New Customer Deployment Trends – PART 1</title>
      <link>http://www.enzeecommunity.com/blogs/nzblog/2010/07/20/hadoop-netezza-synergy-in-data-analytics-results-in-new-customer-deployment-trends-part-1</link>
      <description>&lt;!-- [DocumentBodyStart:3a30bca8-7d8d-4b71-bb45-7609f818dd09] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p&gt;&lt;!--StartFragment--&gt;&lt;/p&gt;&lt;div class="jive-quote"&gt;&lt;p class="MsoNormal"&gt;&lt;em&gt;Two things before I begin:&lt;/em&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;em&gt;&lt;span&gt;I’ll begin this posting with a call for inputs. Below I will list a few of the most common Hadoop/Netezza co-existence deployment patterns we have seen to date. But I would like to hear from others. As you see the continuing deployment of Hadoop in the enterprise and as the&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.netezza.com/releases/2010/release062110_4.htm"&gt;Second Wave of TwinFin™&lt;/a&gt; &lt;span&gt;comes on with the advanced analytics capabilities of i-Class, how do you see the evolving deployment patterns happening in your environment?&lt;/span&gt;&lt;/em&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;em&gt;&lt;span&gt;&lt;/span&gt;&lt;/em&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;em&gt;&lt;span&gt;A special hat-tip to Krishnan Parasuraman, Netezza’s Chief Architect for our Digital Media group, for his excellent help in aiding and abetting this post! I have used his guidance gratefully and (with his permission) stolen freely from some of his inputs.&lt;/span&gt;&lt;/em&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;You may have noticed a partnership announcement made by&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.netezza.com/releases/2010/release071510.htm"&gt;Cloudera and Netezza&lt;/a&gt; &lt;span&gt;late last week. Together with Cloudera, Netezza will open up data movement and transformation between Cloudera’s Distribution for Hadoop and the Netezza family of appliances applications and data flows for integration of the two systems. 
We expect that our partnership with Cloudera, together with the Hadoop support in Netezza’s i-Class™ set of advanced analytics capabilities included in the upcoming 6.0 software release, will lead to some very innovative and expansive applications for our customers and for both companies.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;Even today, Netezza customers are doing some very interesting things with joint deployments of Hadoop and our TwinFin data warehouse appliance. Far from being the “Hadoop v. SQL” battle that some people might like to make the current market out to be, we have instead noticed a growing number of “co-existence” deployment strategies and design patterns already at work with our customers – particularly among customers in the&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.netezza.com/data-warehouse-appliance-industries/digital-media.aspx"&gt;“Digital Media”&lt;/a&gt; &lt;span&gt;vertical market.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;These types of strategies can play to the strengths of both technologies and roughly break down into two categories: 1) using a Hadoop Cluster for data ingestion, which I’ll cover in detail today; and 2) using a Hadoop Cluster for long-term data retention, or as a “queryable archive,” which I’ll cover in a post later this week.&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;strong&gt;Using a Hadoop Cluster for Raw Data Ingestion&lt;/strong&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;The use of a Hadoop Cluster as the engine for data ingestion is the most 
common “co-existence” pattern we see in our customers’ mutual deployments of Hadoop and Netezza. The deployment pattern typically arises when the customer has hit specific performance and throughput scalability limits with their existing Data Integration or ETL implementation.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;Raw weblog data is the primary data source for most Digital Media analytics and reporting requirements. Weblogs are data rich (e.g., page views, impressions, click-throughs and demographics collected from application servers). They are typically semi-structured, and are collected and stored in flat files.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;There are some critical facts about weblogs that present real performance challenges in processing them:&lt;/span&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;em&gt;&lt;span&gt;sheer volume&lt;/span&gt;&lt;/em&gt;&lt;span&gt;: millions of rows of weblog data collected throughout the day and loaded daily into the data warehouse;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;em&gt;&lt;span&gt;complex query processing&lt;/span&gt;&lt;/em&gt;&lt;span&gt;: parsing and decoding encoded character strings requires text-processing, pattern-matching and tokenizing capabilities within the ETL process;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;em&gt;&lt;span&gt;non-conformed dimensions&lt;/span&gt;&lt;/em&gt;&lt;span&gt;: page views or impression data are defined and represented differently by various systems, so fitting them into conformed dimensions is another very common data ingestion &amp;amp; processing challenge.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;There are two 
common variants of this pattern – one for semi-structured data (e.g., weblogs) and one for unstructured data (e.g., text) – and customers often run both simultaneously.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: center;"&gt;&lt;a href="http://www.enzeecommunity.com/servlet/JiveServlet/showImage/38-1158-1599/Hadoop-NZ+2.png"&gt;&lt;img alt="Hadoop-NZ 2.png" class="jive-image-thumbnail jive-image" src="http://www.enzeecommunity.com/servlet/JiveServlet/downloadImage/38-1158-1599/Hadoop-NZ+2.png" width="620"/&gt;&lt;/a&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: center;"&gt;&lt;strong&gt;Semi-structured data ingest via Hadoop&lt;/strong&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;Semi-structured data is parsed (and possibly aggregated as well) in the Hadoop Cluster and then loaded into a TwinFin, where the appliance’s performance and workload scaling matter for deeper analysis, higher throughput and faster reporting.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: center;"&gt;&lt;span&gt;&lt;a href="http://www.enzeecommunity.com/servlet/JiveServlet/showImage/38-1158-1598/Hadoop-NZ+1.jpg"&gt;&lt;img alt="Hadoop-NZ 1.jpg" class="jive-image" src="http://www.enzeecommunity.com/servlet/JiveServlet/downloadImage/38-1158-1598/Hadoop-NZ+1.jpg"/&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: center;"&gt;&lt;strong&gt;Unstructured data ingest via Hadoop&lt;/strong&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 
0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;Unstructured data in this pattern is contextualized (classified, mined, keyworded and indexed) in Hadoop and then moved into a Netezza TwinFin appliance for the low-latency, high-performance analytics used to drive business decisions.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;A Hadoop Cluster&lt;/span&gt; &lt;span&gt;provides a scalable ingestion mechanism that is well suited to the challenges described above. The Cluster can be scaled incrementally to ingest massive volumes of weblog data, and it supports text processing and complex data processing through programming languages such as Java or Python.&lt;/span&gt; &lt;em&gt;[Note that with the coming i-Class set of analytics functionality, the programmability and some of the complex data processing may also be possible on the TwinFin, depending on a customer’s application needs or preferences.]&lt;/em&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;Following the data ingest steps, processed weblog information is brought into TwinFin as atomic event information or as summarized tables, depending on the size of the appliance and the analytic maturity &amp;amp; scale of the organization where it is deployed. 
A typical deployment might look like the following diagram:&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;a href="http://www.enzeecommunity.com/servlet/JiveServlet/showImage/38-1158-1596/Hadoop-NZ+Arch+1.jpg"&gt;&lt;img alt="Hadoop-NZ Arch 1.jpg" class="jive-image" src="http://www.enzeecommunity.com/servlet/JiveServlet/downloadImage/38-1158-1596/Hadoop-NZ+Arch+1.jpg"/&gt;&lt;/a&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;Some of our customers use an alternate, far less common, design for the above co-existence pattern: an external elastic MapReduce cloud (such as the Amazon Cloud) for data ingestion.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;Where a customer runs its application servers in Amazon’s EC2, it may also choose to use Amazon’s S3 web services to retain weblog data. In that case, Amazon would provide the elastic MapReduce infrastructure for the data ingest process into the TwinFin appliance. 
This alternative deployment scenario would look something like the following:&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;a href="http://www.enzeecommunity.com/servlet/JiveServlet/showImage/38-1158-1595/Hadoop-NZ+Arch+2.jpg"&gt;&lt;img alt="Hadoop-NZ Arch 2.jpg" class="jive-image" src="http://www.enzeecommunity.com/servlet/JiveServlet/downloadImage/38-1158-1595/Hadoop-NZ+Arch+2.jpg"/&gt;&lt;/a&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span&gt;The bottom line is that the different strengths of TwinFin and Hadoop lend themselves to complementary deployments – and some of our customers have already discovered innovative ways to use them together to maximize the value of both investments.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;strong&gt;In my next post, I’ll discuss the second pattern we’re noticing: one in which Netezza customers are using the Hadoop Cluster for long-term data retention.&lt;/strong&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:3a30bca8-7d8d-4b71-bb45-7609f818dd09] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/tags">i-class</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">twinfin_i-class</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">twinfin</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">cloudera</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">netezza</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">mapreduce</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">performance</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">data_warehouse</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">advanced_analytics</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">analytics</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">etl</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">hadoop</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">java</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">sql</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">business_intelligence</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">data_integration</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">python</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">price-performance</category>
      <pubDate>Tue, 20 Jul 2010 02:43:27 GMT</pubDate>
      <author>pfrancisco</author>
      <guid>http://www.enzeecommunity.com/blogs/nzblog/2010/07/20/hadoop-netezza-synergy-in-data-analytics-results-in-new-customer-deployment-trends-part-1</guid>
      <dc:date>2010-07-20T02:43:27Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/blogs/nzblog/comment/hadoop-netezza-synergy-in-data-analytics-results-in-new-customer-deployment-trends-part-1</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/nzblog/feeds/comments?blogPost=1158</wfw:commentRss>
    </item>
    <item>
      <title>A Little Late Musing on the Greenplum Acquisition</title>
      <link>http://www.enzeecommunity.com/blogs/daiclegg/2010/07/13/a-little-late-musing-on-the-greenplum-acquisition</link>
      <description>&lt;!-- [DocumentBodyStart:9fcee3de-2bb8-4de3-8a48-dcbd4bdac3ae] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 10pt;"&gt;&lt;span style="font-family: verdana, geneva; color: #000000;"&gt;I wasn’t planning to blog about EMC’s acquisition of Greenplum since Phil Francisco has commented&lt;/span&gt; &lt;a class="jive-link-external-small" href="/blogs/emc-swallows-a-green-plum"&gt;&lt;span style="font-family: verdana, geneva;"&gt;here&lt;/span&gt;&lt;/a&gt; &lt;span style="font-family: verdana, geneva; color: #000000;"&gt;and many others, better qualified than me, have had their say (e.g.&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.dbms2.com/2010/07/06/emc-is-buying-greenplum/?utm_content=Google+Reader"&gt;&lt;span style="font-family: verdana, geneva;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #000000;"&gt;&lt;span style="font-family: verdana, geneva;"&gt;), but it did occur to me that one point this illustrates is how data warehousing got interesting again after ten years as a bit player in the drama of information technology. Suddenly, led, I have to say, by Netezza back in 2003, a whole generation of disruptive innovators has entered what was a stagnant market of established players and is redefining the segment. Richard Hackathorn said as much at EnZee Universe.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 10pt;"&gt;&lt;span style="color: #000000;"&gt;&lt;span style="font-family: verdana, geneva;"&gt;And I had that in mind last week when talking with someone from one of Greenplum’s software-only competitors. His take was that the choice was between a data warehouse machine and software only on a self-assembly hardware configuration (sounds like a grid from IKEA – I’d like to see the Allen key in that kit). 
And Greenplum had decided the warehouse machine was the way to go. Of course I wouldn’t be a good Netezza corporate citizen if I didn’t observe that there are two classes of data warehouse machine: the true appliance (Netezza, out-of-the-box) and the customized data warehouse machine (either vendor-assembled hardware configuration or workload-specific tuned database on commodity hardware, or both).&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 10pt;"&gt;&lt;span style="font-family: verdana, geneva; color: #000000;"&gt;That doesn’t alter what I took away from the conversation, which was that Greenplum tried the software-only route and then plumped for the machine option. Of course they may well have been made an offer they couldn’t sensibly refuse (in the original sense, not the horse’s-head-in-the-bed sense). If so, I guess it’s a case of EMC not being content to be the optional storage component in a configured data warehouse machine and indulging in a bit of supply chain integration. These are interesting times for vendors, customers and, as ever in such situations, analysts and consultants.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 10pt;"&gt;&lt;span style="color: #000000;"&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:9fcee3de-2bb8-4de3-8a48-dcbd4bdac3ae] --&gt;</description>
      <pubDate>Tue, 13 Jul 2010 14:37:21 GMT</pubDate>
      <author>dclegg@netezza.com</author>
      <guid>http://www.enzeecommunity.com/blogs/daiclegg/2010/07/13/a-little-late-musing-on-the-greenplum-acquisition</guid>
      <dc:date>2010-07-13T14:37:21Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/blogs/daiclegg/comment/a-little-late-musing-on-the-greenplum-acquisition</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/daiclegg/feeds/comments?blogPost=1157</wfw:commentRss>
    </item>
    <item>
      <title>Mobile BI and the Vuvuzela Queen</title>
      <link>http://www.enzeecommunity.com/blogs/daiclegg/2010/07/07/mobile-bi-and-the-vuvuzela-queen</link>
      <description>&lt;!-- [DocumentBodyStart:88322f7e-6eb2-4fb4-bc69-30c433afb531] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 10pt;"&gt;&lt;span style="font-family: Calibri; color: #000000; font-size: 12pt;"&gt;I took a little time out yesterday to think about the implications of a possible huge upsurge in mobile BI apps. You don’t have to share&lt;/span&gt; &lt;a class="jive-link-external-small" href="/blogs/your-mother-is-a-hamster"&gt;&lt;span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"&gt;&lt;span style="font-family: Calibri; font-size: 12pt;"&gt;Michael Saylor’s unwavering belief&lt;/span&gt;&lt;/span&gt;&lt;/a&gt; &lt;span style="color: #000000;"&gt;&lt;span style="font-family: Calibri;"&gt;&lt;span style="font-size: 12pt;"&gt;that iPad and iPhone will be the dominant delivery&lt;/span&gt;&lt;span style="font-size: 10pt;"&gt; &lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;platform in order to acknowledge that there is a real opportunity here. The use cases that I have heard talked about seem to fall into two distinct classes. The first type would be role-based dashboards, for example top ten best movers/worst movers/best aggregate margin etc. for a retail store manager. These might be database-intensive queries, but they would generally be cached because the results would change slowly. The second type of query would be ad-hoc requests for specific data, for example ‘what’s the inventory for product xxx?’ These could be run very frequently, with different parameters, but would not be database intensive. And both kinds of queries would be pre-baked into the app that the user downloads to their mobile device. 
All of this seems eminently doable for MicroStrategy using their new mobile app development toolset and their intelligent server run time.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 10pt;"&gt;&lt;span style="font-family: Calibri; color: #000000; font-size: 12pt;"&gt;And the same might well be true for any other comparable app development platform. The critical success factor here is not the deployment platform, though some folks may get burned there depending on how the fight for dominance plays out. The critical success factor is fast response time. A BI specialist running operational reports or complex predictive analytics across the whole dataset may be happy with a response time of several seconds or even minutes. But a mobile user has to have the answer quickly or it’s not worth having. From a Netezza perspective this is all music to my ears. Any kind of app that needs fast access to a mass of data is going to need the fastest database that can be put behind it. We have already partnered with MicroStrategy and QuantiSense to deliver the&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.netezza.com/data-warehouse-appliance-products/raa.aspx"&gt;&lt;span style="font-family: Calibri; color: #0000ff; font-size: 12pt;"&gt;Retail Analytics Appliance&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family: Calibri; color: #000000; font-size: 12pt;"&gt;. It will be interesting to see how the market for mobile BI apps develops. 
I think it’s clear that retail is a fertile segment, but not the only one.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 10pt;"&gt;&lt;span style="font-family: Calibri; color: #000000; font-size: 12pt;"&gt;As you do, if you’re a partner-vendor exhibitor at a conference, you bring along what I’ve always called a twomm (to rhyme with from) – total waste of marketing money, otherwise known as the stand giveaway. Of course they are not a waste of money; they represent a unique opportunity to build lasting positive brand identification, blah, blah, blah. But it being World Cup semi-final week, and it being my choice, we have vuvuzelas as our giveaway. And give ‘em away we have. I’m not even sure I’ll have one to take back to my Dutch footballing friend who I was texting with as I watched their game (1-0 pleased: 1-1 worried: 2-1 confident: 3-1 triumphant: 3-2 clenching) against Uruguay at the conference beach party (thanks MicroStrategy – good party). Next to me throughout the second half was a tall Dutch (from appearance, mien and allegiance even if she had said nothing) woman doing a great job with a Netezza vuvuzela. Thank you madam for promoting our brand, and congratulations on a (just about) deserved victory. I’m hoping for more branding opportunities tonight. Not sure who I want to win: hugely talented Spain or unexpectedly fluent and uninhibited Germany. 
Or should I say, who would I prefer to lose: enviably talented Spain or England-crushing Germany?&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0cm 0cm 10pt;"&gt;&lt;span style="font-family: Calibri; color: #000000; font-size: 12pt;"&gt;Anyone get that somewhat torturous music reference in the title?&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:88322f7e-6eb2-4fb4-bc69-30c433afb531] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/tags">analytics</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">netezza</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">big</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">mobile-bi</category>
      <pubDate>Wed, 07 Jul 2010 09:46:54 GMT</pubDate>
      <author>dclegg@netezza.com</author>
      <guid>http://www.enzeecommunity.com/blogs/daiclegg/2010/07/07/mobile-bi-and-the-vuvuzela-queen</guid>
      <dc:date>2010-07-07T09:46:54Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/blogs/daiclegg/comment/mobile-bi-and-the-vuvuzela-queen</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/daiclegg/feeds/comments?blogPost=1156</wfw:commentRss>
    </item>
    <item>
      <title>EMC Swallows a Green Plum?</title>
      <link>http://www.enzeecommunity.com/blogs/nzblog/2010/07/07/emc-swallows-a-green-plum</link>
      <description>&lt;!-- [DocumentBodyStart:aafe1558-f331-436b-8a8f-d1d94b20bf23] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;!--StartFragment--&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;News broke on Tuesday that&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;a class="active_link" href="http://www.emc.com/about/news/press/2010/20100706-01.htm"&gt;EMC plans to acquire Greenplum&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;to focus on data warehousing and analytics on “big data”. 
The idea is that by doing so, EMC is officially throwing its hat into the competitive ring for the ‘Data Warehouse Appliance’ (DWA) market – something of a defensive mechanism now that virtually all of the major data warehouse vendors are now selling their own versions of a DWA – and consequently&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;strong&gt;greatly&lt;/strong&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;reducing sales pull-through of EMC storage for data warehouse deployments.&lt;br/&gt;&lt;br/&gt; Some referred to the merger as “&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;em&gt;&lt;a class="jive-link-external-small" href="http://www.dbms2.com/2010/07/06/emc-is-buying-greenplum/"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;a good fit for a storage vendor with appliance-y ideas&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;/em&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;” and others hailed it as follows, “&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;em&gt;&lt;a class="jive-link-external-small" href="http://wikibon.org/blog/emc-picks-a-greenplum/"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;the market has shifted as of late moving toward integrated appliances and this move gives EMC a very important arrow in its quiver&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;/em&gt;&lt;span 
style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;” and labeled Greenplum as a purveyor of “&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;em&gt;very high performance database systems&lt;/em&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;”.&lt;br/&gt;&lt;br/&gt; One can also reasonably assume that this acquisition not only is intended to shore up a product offering weakness, but that it is also destined for affiliation with EMC’s other major initiative announced earlier this year – the&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;a class="active_link" href="http://www.prnewswire.com/news-releases/cisco-and-emc-appoint-michael-d-capellas-to-lead-vce-coalition-named-ceo-of-acadia-joint-venture-92925469.html"&gt;Acadia Virtual Computing Environment (VCE) Joint Venture with Cisco Systems&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;and headed up by Michael Capellas. The Acadia JV includes EMC’s storage and its VMWare virtualization software as well as Cisco Systems’ compute nodes and networking. 
VCE is built on the concept of modular building blocks,&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;a class="active_link" href="http://www.emc.com/collateral/brochure/h6703-br-vce-external-vblock-package.pdf"&gt;called vblocks&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;that marry computing horsepower to storage capacity. All that’s missing from that story is a data warehouse DBMS to make it a full-on data warehouse appliance, right?&lt;br/&gt;&lt;br/&gt; There are two big problems with these assumptions…&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;strong&gt;Performance:&lt;/strong&gt; &lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;For all the discussion about “scale” and  “big data” in the EMC announcement, there&lt;/span&gt; &lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;is&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span 
style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;span style="text-decoration: underline;"&gt;no mention&lt;/span&gt; o&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;f&lt;/span&gt; &lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;how either party can address the real issues that mainstream enterprises face every single day with their data warehouse systems – how to get maximum performance out of a complex, highly concurrent operational environment where hundreds if not thousands of users are banging away on the system, night and day.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;The fact is that the actual Greenplum target market has clearly NOT been one that focused on high-performance analytics over the past several years. 
Instead, the few wins publicly announced by the company have been for very high capacity, limited compute platforms – applications more commonly referred to as “queryable archive”.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;Curt Monash today again mentioned&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;a class="active_link" href="http://www.dbms2.com/2010/07/06/emc-is-buying-greenplum/"&gt;Greenplum’s lack of support for the “high-concurrency”&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;requirements of a mainstream data warehouse.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;This looks much more like adding a very basic set of storage-centric data warehousing capabilities in &lt;a class="jive-link-external-small" href="http://chucksblog.emc.com/chucks_blog/2010/05/once-upon-a-time.html"&gt;a move to find a broader channel for EMC’s traditional storage products&lt;/a&gt; rather than any strategic move into the world of high performance data analytics. 
Further to this point, neither company has done much of anything to address a very strong trend in the mainstream data warehouse market – the marriage of advanced, predictive analytics into the busy data warehouse systems.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;a class="active_link" href="http://wikibon.org/blog/emc-picks-a-greenplum/"&gt;David Vellante&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;confirmed that to be successful the EMC/Greenplum marriage will need to yield, “&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;em&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;optimized sytems&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/em&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;[sic]&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;em&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;; smokin’ fast performance; reference architectures; scale;”&lt;/span&gt;&lt;/span&gt; &lt;span style="font-style: normal;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;and&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span 
style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;“federation capabilities; not just big honking systems.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/em&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;” We couldn’t agree more, but one can’t help noticing that neither Greenplum nor EMC has brought any of those characteristics to market for data warehousing to date.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;strong&gt;Appliances:&lt;/strong&gt; &lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;Since the acquisition is fairly transparently a defense against moves by the likes of Oracle, Teradata and IBM (as well as Netezza seven years ago) to the appliance model, it’s hard to see how either EMC or Greenplum is effectively equipped now to do battle against those established players.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;EMC have never really “sold” data warehousing to anyone previously and Greenplum have nearly prided themselves on going after “greenfield” high-capacity applications rather than head-to-head competition vs. established players. 
And one need look no further than the limited market penetration of H-P’s NeoView to understand that it takes more than simply deep pockets to succeed in the data warehousing market.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;Greenplum is not a purveyor of “integrated appliances” and at best, they can hope to infuse in EMC the ability to make their joint product offering a little more of an “&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;em&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;appliance-y idea&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/em&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;” (hat tip to Dr. Monash for coining the term) to the market. 
Instead, Greenplum have fashioned themselves over the past several years as a software-only solution.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;Assume that the&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;a class="jive-link-external-small" href="http://chucksblog.emc.com/chucks_blog/2010/07/emc-to-acquire-greenplum.html"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;Acadia VCE and “vblock” application is a big piece of this strategy&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;. Neither Cisco nor EMC would claim that their servers, networking or storage arrays offer the lowest price-per-bit or price-per-performance alternative in the market. 
So one needs to think about what that means in terms of the price-performance competitiveness of this new “&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;em&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;appliance-y&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/em&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;” joint product.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 14pt;"&gt;&lt;span style="font-size: 14pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;In short, Greenplum joins the pantheon of “interesting” acquisitions for EMC as it will certainly stir some news cycles and drive some analysts and bloggers to create “fresh, new” content; but it’s not really something that I think will register on the Richter scale of customer market share&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="font-size: 10pt;"&gt;&lt;span style="font-family: tahoma, arial, helvetica, sans-serif;"&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;!--EndFragment--&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:aafe1558-f331-436b-8a8f-d1d94b20bf23] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/tags">acadia</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">oracle</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">h-p</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">archive</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">analytics</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">emc</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">greenplum</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">cisco</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">vmware</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">dbms2</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">monash</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">wikibon</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">david_vellante</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">vce</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">vblock</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">chuck_hollis</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">storage</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">data_warehouse</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">ibm</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">teradata</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">dwa</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">data_warehouse_appliance</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">performance</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">price-performance</category>
      <category domain="http://www.enzeecommunity.com/blogs/tags">michael_capellas</category>
      <pubDate>Wed, 07 Jul 2010 05:18:39 GMT</pubDate>
      <author>pfrancisco</author>
      <guid>http://www.enzeecommunity.com/blogs/nzblog/2010/07/07/emc-swallows-a-green-plum</guid>
      <dc:date>2010-07-07T05:18:39Z</dc:date>
      <wfw:comment>http://www.enzeecommunity.com/blogs/nzblog/comment/emc-swallows-a-green-plum</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/nzblog/feeds/comments?blogPost=1155</wfw:commentRss>
    </item>
  </channel>
</rss>

