Identifying skewness in joins
Skewness is the system killer. The magic of Teradata is in its parallelism, which distributes the work and data across many processing elements; this magic can turn into mush if the work or data is distributed in an uneven or disproportionate manner. Skew occurs when one or more of the Access Module Processors (AMPs) get a larger-than-average share of the work.
We need to understand that a perfectly even distribution is rarely achievable for a single query. It is recommended not to consider an operation skewed until the portion consumed by the hot AMP exceeds four to five times the average.
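The rule of thumb above can be sketched as a small calculation. The following Python snippet is only an illustration, not a Teradata utility: the function names, the per-AMP figures, and the default threshold of 4.0 (the lower end of the four-to-five-times guideline) are all assumptions made for the example. It takes one usage figure per AMP (CPU seconds or row counts, for instance), divides the hottest AMP's figure by the average, and flags the operation only when that ratio exceeds the threshold.

```python
def skew_ratio(per_amp_usage):
    """Return the hottest AMP's usage divided by the average over all AMPs."""
    average = sum(per_amp_usage) / len(per_amp_usage)
    if average == 0:
        return 0.0  # no work was done, so there is nothing to call skewed
    return max(per_amp_usage) / average


def is_skewed(per_amp_usage, threshold=4.0):
    """Apply the rule of thumb: skewed only if the hot AMP exceeds threshold x average."""
    return skew_ratio(per_amp_usage) > threshold


# Hypothetical CPU seconds per AMP on an 8-AMP system (made-up numbers).
balanced = [10, 11, 9, 10, 10, 10, 11, 9]   # small, unavoidable variation
mild = [8, 8, 8, 8, 8, 8, 8, 24]            # one busier AMP, still under the threshold
hot = [2, 2, 2, 2, 2, 2, 2, 50]             # one AMP does most of the work

for name, usage in (("balanced", balanced), ("mild", mild), ("hot", hot)):
    print(name, round(skew_ratio(usage), 2), is_skewed(usage))
```

With these sample figures, only the last case crosses the threshold; the first two show uneven but acceptable distribution that would not be worth tuning under this guideline.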
Skew of any kind degrades system parallelism. When a query is skewed, join processing slows down because the overloaded AMPs hold up the rest, so the join does not run at full efficiency and the query consumes more CPU and runtime.
The distribution of rows directly affects the benefits of parallelism. The more uniform the distribution...