Monday, October 26, 2015

Importance of setting the right distribution key for improved performance in Netezza

1- try create table table_name ( field1 bigint, field2 varchar(10))distribute on random

insert into table_name ()

generate statistics on table_name

the key is controlling your data types. if your recordset is large varchar fields, netezza is going to struggle for speed. if you can take your number fields and cast them as integer types, you will see better performance. All your varchar fields need to be "reasonable" (don't use varchar(5000) on every field.

2 -

Perhaps the table statistics are not updated.

generate express statistics on table_name;

3- last option:

What is the distrubution key on the existing table? Perhaps that is driving your problem, in that the table has high skew which means you are not distributing across the box efficiently. If you own the table, you may want to rebuild with a better distribution key, keeping in mind that it will be slow, but the future performance of the table will be better.

4- Same distribution key is required :
If you use a CTAS without distribution it will default to the same distribution as the source table, which may be more efficient.


Source

No comments:

Post a Comment