About a year ago we encountered an environment where the client wanted the old system refactored into the new. The "new" here being the Netezza platform and the "old" here being an overwhelmed RDBMS that couldn't hope to keep up with the workload.
So the team landed on the ground with all hopes high. The client had purchased a 10200 (216 processors) for production deployment and a 10100 system for development. Oddly, the same thing happened here as happens in many places. The 10200 was dispatched to the protected production enclave and the 10100 was dropped into the local data center with the developers salivating to get started. And get started they did.
The first team inherited about half a terabyte of raw data from the old system and started crunching on it. The second team, starting a week later, began testing on the work of the first team. A third team entered the fray, building out test cases and a wide array of number-crunching exercises. While these three teams dogpiled onto and hammered the 10100, the 10200 sat elsewhere, humming with nothing to do.
We know that in any environment we encounter, with any technology you can name, the development machines are underpowered compared to the production environment. And while the production environment has a lot of growing priorities for ongoing projects, we don't have this problem for our first project, do we?
And this is the irony - for a first project we have a huge "first-bubble" of work before us that will never appear again. The bubble includes all the data movement, management and backfilling of structures that we will execute only once, right? Really? I've been in places where these processes have to be executed dozens if not hundreds of times in a development or integration environment as a means to boil out any latent bugs prior to the maiden - and only - conversion voyage. But is this a maiden-and-only voyage? Hardly - typically the production guys will want to make several dry runs of the stuff too. We can multiply their need for dry runs with ours, because we have no intention of invoking such a large-scale movement of data without extensive testing.
And yet, we're doing it on the smaller machine. No doubt the 10100 has some stuff - but I've seen cases where it might take us two weeks to wrap up a particularly heavy-lifting piece of logic. If we'd done this on the larger 10200, we would have finished it in a week or less. Double the power, half the time-to-deliver (when the time to deliver is governed by testing). In practically every case of a data warehouse conversion, the actual 'coding' and development itself is a nit compared to the timeline required for testing. I've noted this in a number of places and forms: the testing load for a data warehouse conversion is the largest and most protracted part of the effort. And if testing (as in our case) is largely loading, crunching and presenting the data, we need the strongest possible hardware to get past the first bubble.
So this is a case for any data warehouse project, not just one with a Netezza machine. The first bubble is the worst bubble. As our techs slave over a hot CPU, sweating out the extreme workload of the initial conversion, they will quickly start to compare the machine they are working on with that production machine sitting over there with nothing to do. It wouldn't matter what the technology happened to be - the equation is out of kilter. We need all the available power to get past the first bubble.
But I've had this conversation with more people than I can count. Why can't you deploy the production-destined machine with all its power, for development/testing use in getting past the first bubble, then scratch the system and deploy for production? What is the danger here? I know plenty of people, some of them vendor product engineers, who would be happy to validate such a 'scratch' so that the production system arrives with nothing but its birthday-suit - its originally deployed default environment. Yet another philosophy is that we would pre-configure the machine for production deployment, but nobody likes developers doing this kind of thing in a vacuum. They would rather see deployment/implementation scripts that blow out and instantiate the implementation. I'm a big fan of that, too, for the first and every following deployment. That's why I would prefer we use the production-destined system to get past the first-bubble-blues, then scratch it, get the original environment standing up straight, and then treat it as an operational production asset.
Most projects like this have a very short runway, and we do a disservice to the hard-working folks who are doing their best to stand up this environment. They need all the power they can get, especially when they enter the testing cycle. And for this, it's an 80/20 rule for every technical work product we will ever produce. Take a look sometime at what it takes to roll out a simple Java Bean, or a C# application, or a web site. Part of the time is spent in raw development, and part of it in testing. If I include the total number of minutes spent by the developer in unit testing, and then by hardcore testers in a UAT or QA environment, it is clear that the total wall-clock hours spent in producing quality technology break into the 80/20 rule - 20 percent of the time is spent in development, and 80 percent in testing.
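To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It is my own illustration, not a measured result: the function name `estimated_days` is hypothetical, and it assumes that testing time shrinks in direct proportion to processor count while development time does not change at all. It takes the 10200's 216 processors from above and assumes 108 for the 10100 (half the power).

```python
def estimated_days(baseline_days, baseline_procs, target_procs, test_fraction=0.8):
    """Estimate elapsed days on a different machine.

    Assumptions (illustrative only):
      - test_fraction of the baseline time is testing (the 80/20 rule)
      - testing time scales inversely with processor count
      - development time is unaffected by the hardware
    """
    if baseline_procs <= 0 or target_procs <= 0:
        raise ValueError("processor counts must be positive")
    if not 0.0 <= test_fraction <= 1.0:
        raise ValueError("test_fraction must be between 0 and 1")

    dev_days = baseline_days * (1.0 - test_fraction)
    test_days = baseline_days * test_fraction
    return dev_days + test_days * baseline_procs / target_procs


if __name__ == "__main__":
    # Two weeks of work on the smaller machine, moved to one twice the size.
    # When the timeline is governed entirely by testing, it halves: 7.0 days.
    print(estimated_days(14, 108, 216, test_fraction=1.0))

    # With the 80/20 split, only the testing share shrinks: roughly 8.4 days.
    print(round(estimated_days(14, 108, 216), 1))
```

Real workloads rarely scale this cleanly, so treat the output as a conversation starter rather than a forecast.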
And if the majority of the time is spent in testing, what are we testing in Enzee space? The machine's ability to load, internally crunch and then publish the data. On a Netezza machine, this last operation is largely a function of the first two. But we have to test all the loading, don't we? And when testing the full processing cycle we have to load-and-crunch in the same stream, no? What does it take to do this? Hardware, baby, and lots of it. So why are we doing it on one-third of the available hardware? (We're on a 10100, and the 10200 is sitting over there, humming away and taunting us from a distance!)
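The "one-third" figure is simple arithmetic, sketched below. The 216 processors for the 10200 come from the client's purchase described earlier; the 108 for the 10100 is my assumption, inferred from "double the power". The names `PROCESSORS` and `hardware_share` are just for illustration.

```python
# Processor counts per machine. 216 is stated above for the 10200;
# 108 for the 10100 is an assumption (half the power of the 10200).
PROCESSORS = {"10100": 108, "10200": 216}


def hardware_share(machine, fleet=PROCESSORS):
    """Return the fraction of total processors that one machine provides."""
    return fleet[machine] / sum(fleet.values())


if __name__ == "__main__":
    for name in PROCESSORS:
        print(f"{name}: {hardware_share(name):.0%} of the available processors")
```

Under those assumptions the three teams are sharing a third of the processors on the floor while the other two-thirds sit idle.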
I can say that multiple small teams can get a lot of "ongoing" work done on a 10100, no doubt a very powerful environment. I can also say that multiple teams sharing a machine like this in the first-bubble effort will gaze longingly at the 10200 in the hopes they can get to it soon, because so much testing is still before them, and they need the power to close. With that, Netezza gives us the power to close faster than any other environment, to get past this first-bubble without the blues - we only hurt ourselves with rules for the environment that are impractical for the first-bubble. All things considered, on a traditional platform we would see months pass for work that a Netezza machine would finish in weeks.
Alas, when one has a Netezza machine, it bends gravity and dilates time. Months become weeks. Weeks become days. And yet, we still need more power. More is never enough to wash away the blues.
Those first-bubble-blues.

