Skip navigation
15546 Views 4 Replies Latest reply: Feb 24, 2010 1:50 PM by David Birmingham RSS
playfair1 New Enzee 1 posts since
Oct 4, 2009
Currently Being Moderated

Feb 24, 2010 12:04 PM

Exact Numeric Data Types - what precision to use?

We have a billion row standard fact table (integer/bigint surrogate keys, fact columns are all in numeric(p,s) format).

 

The NZ document recommends:

Always use the smallest integer or fixed-point numeric whenever possible. When converting source data to the NPS system,

you may need to analyze the data to determine the smallest data type that you can use.

 

 

However, some of our fact columns may contain currency, e.g. Japanese yen, US $ etc.

To make the design more generic, we would like to use the largest precision for fixed point numeric whenever possible.

 

I did some experiments with Numeric(9,2), Numeric(18,2), Numeric(38,2) on Twinfin. It turns out the disk space is really

determined by the underlying data that are stored in those columns. Changing data types from 4 to 8 to 16 bytes fixed point numeric

almost has no impact on the actual data storage (compression is on by default).

 

Any recommendations on how to pick the "generic" precision for fixed-point numeric while preserving the disk consumption (potentially positive

impact on query performance)?

  • David Birmingham Active Enzee 426 posts since
    Sep 24, 2007
    Currently Being Moderated
    1. Feb 24, 2010 1:00 PM (in response to playfair1)
    Re: Exact Numeric Data Types - what precision to use?

    Well, not to split hairs or anything, but if you can do a smaller integer it will alway save the disk space over a larger one, and each additional byte in a billion-row table is a gigabyte of storage.

     

    The necessary precision should not be driven by Netezza's constraints but by your business rules. So if you need the additional widht or precision, just do it. The compression will cover you either way.

     

    Another aspect is that float does not compress. So stay in the numeric if at all possible (these are just scaled integers anyhow).

     

    Another aspect of your data-size testing - you need to use a very large sample to test with because Netezza has a default-minimum database size that will be allocated no matter what the size of the table, so you could get a false positive (that they all appear the same) because your sample is too small.

  • Shawn Fox Enzee Exraordinaire 1,333 posts since
    Aug 15, 2006
    Currently Being Moderated
    2. Feb 24, 2010 1:39 PM (in response to playfair1)
    Re: Exact Numeric Data Types - what precision to use?

    I would stick with NUMERIC(18,X) if possible.  A NUMERIC(18,X) uses 8 bytes whereas anything bigger uses 16 bytes.  The net result is that NUMERIC(18) uses significantly less CPU / FPGA resources since it is half the size of a NUMERIC(38,X).  Also you should see lower disk usage for NUMERIC(18,X) vs. NUMERIC(38,X).  Compression will remove most of the difference but not all.

     

    In the past I have usually used NUMERIC(18,4) to store monetary values but that was due to converting all money to a single currency (and storing the original conversion factor along with the fact data so that I could undo the conversion if needed).

     

    Unless you are processing Zimbabwe currency or the US national debt a NUMERIC(18,X) should be plenty.

  • David Birmingham Active Enzee 426 posts since
    Sep 24, 2007
    Currently Being Moderated
    3. Feb 24, 2010 1:48 PM (in response to Shawn Fox)
    Re: Exact Numeric Data Types - what precision to use?

    The only thing you would need to test, past all this - is to make sure that your BI math doesn't spontaneously combust because of overflow. Don't jump to convert them to float until you read my blog entry A Tastier Float here on the Netezza community (Gather 'Round the Grill)

  • David Birmingham Active Enzee 426 posts since
    Sep 24, 2007
    Currently Being Moderated
    4. Feb 24, 2010 1:50 PM (in response to Shawn Fox)
    Re: Exact Numeric Data Types - what precision to use?

    <duplicate>

More Like This

  • Retrieving data ...

Bookmarked By (0)

Legend

  • Correct Answers - 4 points
  • Helpful Answers - 2 points